<p>Robert Dougherty-Bliss (robert.w.bliss@gmail.com) · <em>A place, perchance, to dream</em> · <a href="https://rwbogl.github.io/feed.xml">https://rwbogl.github.io/feed.xml</a></p>

<h1 id="arranging-intervals">Arranging intervals</h1>
<p>2019-07-21 · <a href="https://rwbogl.github.io/arranging-intervals">https://rwbogl.github.io/arranging-intervals</a></p>
<p>Suppose that we have a collection of $n$ tasks which can be started or stopped
at any of $k$ possible points in time. For instance, if we have the tasks <code class="language-plaintext highlighter-rouge">a</code>
and <code class="language-plaintext highlighter-rouge">b</code> with 4 distinct points in time, here are the possible ways to arrange
the tasks to take up all 4 blocks of time:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>a-a-b-b
a-b-a-b
b-a-a-b
a-b-b-a
b-a-b-a
b-b-a-a
</code></pre></div></div>
<p>The string <code class="language-plaintext highlighter-rouge">a-a-b-b</code> denotes that task <code class="language-plaintext highlighter-rouge">a</code> starts at the first time slot,
finishes in the second, and is then replaced by task <code class="language-plaintext highlighter-rouge">b</code> for two time slots.
The string <code class="language-plaintext highlighter-rouge">a-b-a-b</code> denotes that task <code class="language-plaintext highlighter-rouge">a</code> begins, task <code class="language-plaintext highlighter-rouge">b</code> begins while <code class="language-plaintext highlighter-rouge">a</code> is
ongoing, task <code class="language-plaintext highlighter-rouge">a</code> finishes, then task <code class="language-plaintext highlighter-rouge">b</code> finishes.</p>
<p>If we require that the $n$ tasks take up <em>all</em> $k$ of the time slots, then the
six possibilities above are <em>all</em> of the possibilities for $n = 2$ and $k = 4$.
If we had fewer time slots or more tasks, then there would be some overlap. We
write two tasks in the same time slot by juxtaposing the task names, such as
<code class="language-plaintext highlighter-rouge">ab</code>. For example, filling two time slots with two tasks <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> can only
be written as <code class="language-plaintext highlighter-rouge">ab-ab</code>. We do not distinguish this from <code class="language-plaintext highlighter-rouge">ba-ba</code> or <code class="language-plaintext highlighter-rouge">ba-ab</code>.</p>
<p>Let’s agree to call these “tasks” <em>intervals</em>, in the sense that they represent
intervals of time between points. The intervals we have seen so far all have
two points. In an analogous way, we could discuss intervals with three, four,
or any number of points. For example, if tasks <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> occupy intervals of
length 3, then we could arrange them into 4 slots as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>a-ab-ab-b
</code></pre></div></div>
<p>Here is a general question about intervals:</p>
<blockquote>
<p>How many ways can you arrange $n$ intervals of length $s$ into $k$ slots?</p>
</blockquote>
<p>Let $\lambda_s(k, n)$ be the number of arrangements of $n$ intervals of length
$s$ into $k$ slots. We shall prove the following result about $\lambda_s$.</p>
<p><strong>Theorem.</strong> For each positive integer $s$, $\lambda_s$ satisfies the
$(s + 1)$-term recurrence</p>
<script type="math/tex; mode=display">\begin{equation}
\label{lambda-recurrence}
\lambda_s(k, n) = {k \choose s} \sum_j {s \choose j} \lambda_s(k - j, n - 1)
\end{equation}</script>
<p>with initial conditions $\lambda_s(k, 0) = [k = 0]$. Further, $\lambda_s$ may
be expressed as the sum</p>
<script type="math/tex; mode=display">\begin{equation}
\label{lambda-sum}
\lambda_s(k, n) = \sum_j {k \choose j} (-1)^{k - j} {j \choose s}^n.
\end{equation}</script>
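<p>Before diving into the proofs, both claims are easy to sanity-check by brute force. The sketch below (hypothetical helper names; an arrangement is modeled as a choice of an $s$-subset of the $k$ slots for each of the $n$ labeled intervals, with every slot covered) compares direct enumeration against the claimed sum:</p>

```python
from itertools import combinations, product
from math import comb

def brute_lambda(s, k, n):
    """Count arrangements of n intervals, each occupying s of the k slots,
    with every slot covered, by direct enumeration."""
    count = 0
    for choice in product(combinations(range(k), s), repeat=n):
        covered = set().union(*choice)  # slots touched by some interval
        if len(covered) == k:
            count += 1
    return count

def lambda_sum(s, k, n):
    """The claimed closed form: sum_j C(k, j) (-1)^(k - j) C(j, s)^n."""
    return sum(comb(k, j) * (-1) ** (k - j) * comb(j, s) ** n
               for j in range(k + 1))

# The six arrangements of two length-2 intervals into four slots:
print(brute_lambda(2, 4, 2), lambda_sum(2, 4, 2))  # prints "6 6"
```

The enumeration is exponential in $n$ and $k$, of course; it is only meant as a check on small cases.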
<p><strong>Proof of recurrence.</strong> Single out the $n$th interval. For some $j$ with
$0 \leq j \leq s$, this interval must occupy exactly $j$ slots where no other task is
present. If we remove those points along with all occurrences of the $n$th
interval itself, we are left with an arrangement of $n - 1$ intervals onto $k - j$
points, of which there are $\lambda_s(k - j, n - 1)$. There are ${k \choose j}$ ways to
choose the initial $j$ points which the $n$th interval occupies alone, and then
${k - j \choose s - j}$ ways to place the remaining $s - j$ points. This means
that each $j$ contributes exactly</p>
<script type="math/tex; mode=display">\lambda_s(k - j, n - 1) {k \choose j} {k - j \choose s - j}</script>
<p>arrangements to the total. Therefore</p>
<script type="math/tex; mode=display">\lambda_s(k, n) = \sum_j \lambda_s(k - j, n - 1) {k \choose j} {k - j \choose s - j}.</script>
<p>Our recurrence follows from the well-known binomial coefficient identity</p>
<script type="math/tex; mode=display">{k \choose j}{k - j \choose s - j} = {k \choose s} {s \choose j}. \quad \blacksquare</script>
<p>To prove the summation identity we need a lemma about falling factorials and
exponential generating functions.</p>
<p><strong>Lemma.</strong> If $f$ is the exponential generating function (egf) of the sequence
$a(n)$, then the egf of $n^{\underline{m}} a(n)$ is $x^m D^m f$, where $D$ is
the differential operator.</p>
<p>The lemma’s proof is a routine computation.</p>
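<p>For completeness, here is the one-line computation behind it: since $D^m x^n = n^{\underline{m}} x^{n - m}$,</p>
<script type="math/tex; mode=display">x^m D^m f = x^m \sum_{n \geq 0} \frac{a(n)}{n!} n^{\underline{m}} x^{n - m} = \sum_{n \geq 0} \frac{n^{\underline{m}} a(n)}{n!} x^n,</script>
<p>which is precisely the egf of $n^{\underline{m}} a(n)$.</p>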
<p><strong>Proof of summation identity.</strong> Let $a_n(k) = \lambda_s(k, n)$ and</p>
<script type="math/tex; mode=display">f_n(x) = \sum_{k \geq 0} \frac{a_n(k)}{k!} x^k</script>
<p>be the exponential generating function of $a_n(k)$ in $k$. Further let</p>
<script type="math/tex; mode=display">g_n(x) = f_n(x) e^x = \sum_{k \geq 0} \frac{b_n(k)}{k!} x^k</script>
<p>be the egf of the binomial transform of $a_n(k)$. The coefficients $a_n(k)$ and
$b_n(k)$ are related via the binomial transform, so knowing either one tells us
the other. Using this, we will instead find $b_n(k)$.</p>
<p>Taking the egf of both sides of \eqref{lambda-recurrence} yields, by our lemma,</p>
<p>\begin{equation}
\label{egf-eqn}
f_n = \frac{x^s}{s!} \sum_j {s \choose j} D^{s - j} f_{n - 1}
\end{equation}</p>
<p>Fortunately, $g_n$ satisfies the miraculous identity<sup id="fnref:g-identity"><a href="#fn:g-identity" class="footnote">1</a></sup></p>
<script type="math/tex; mode=display">D^s g_n = e^x \sum_j {s \choose j} D^{s - j} f_n,</script>
<p>so multiplying \eqref{egf-eqn} by $e^x$ yields</p>
<script type="math/tex; mode=display">g_n = \frac{x^s}{s!} D^s g_{n - 1}.</script>
<p>This does not have an easy solution for the egf $g_n$ itself, but it does for
its coefficients. For every nonnegative integer $k$, the coefficient on $x^k
/ k!$ in the left-hand side is $b_n(k)$. The right-hand side is</p>
<script type="math/tex; mode=display">\frac{x^s}{s!} D^s g_{n - 1}
= \frac{x^s}{s!}
\sum_{j \geq 0} \frac{(j + s)^{\underline{s}}}{(j + s)!} b_{n - 1}(j + s) x^j,</script>
<p>so the coefficient on $x^k / k!$ here is</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\frac{k!}{s!} [x^{k - s}]
\sum_{j \geq 0} \frac{(j + s)^{\underline{s}}}{(j + s)!} b_{n - 1}(j + s)
&= \frac{k!}{s!} \frac{k^{\underline{s}}}{k!} b_{n - 1}(k) \\
&= {k \choose s} b_{n - 1}(k).
\end{align*} %]]></script>
<p>Therefore,</p>
<script type="math/tex; mode=display">b_n(k) = {k \choose s}^n b_0(k).</script>
<p>To compute the remaining $b_0(k)$, note that it comes from the binomial
transform:</p>
<script type="math/tex; mode=display">b_0(k) = [x^k / k!] f_0(x) e^x = \sum_{j = 0}^k {k \choose j} a_0(j) = {k \choose 0} a_0(0) = 1,</script>
<p>since $a_0(j) = [j = 0]$. So the $b_0(k)$ factor washes out and we get</p>
<script type="math/tex; mode=display">b_n(k) = {k \choose s}^n.</script>
<p>Inverting the binomial transform now yields</p>
<script type="math/tex; mode=display">a_n(k) = \sum_j {k \choose j} (-1)^{k - j} {j \choose s}^n,</script>
<p>which is exactly what we claimed. $\blacksquare$</p>
<h1 id="using-the-closed-form">Using the closed form</h1>
<p>Despite the intimidating appearance of \eqref{lambda-sum}, we can obtain some
neat information from it.</p>
<p>For fixed $k$ and $s$, equation \eqref{lambda-sum} is a proper closed form. The
variable $n$ only appears as a power in \eqref{lambda-sum}, never as
a summation limit or binomial coefficient variable. For example, if $a_{s,
k}(n) = \lambda_s(k, n)$, then</p>
<script type="math/tex; mode=display">a_{3, 4}(n) = 4^n - 4</script>
<p>for all $n \geq 1$.</p>
<p>We can also immediately see the ordinary generating function (ogf) of the
sequence $a_{s, k}(n)$:</p>
<script type="math/tex; mode=display">\begin{equation}
\label{gf}
\sum_{n \geq 0} a_{s, k}(n) x^n =
\sum_j \frac{ {k \choose j} (-1)^{k - j}}{1 - {j \choose s} x}.
\end{equation}</script>
<p>This tells us to expect the ogf to be some rational function whose
denominator is the product of $(1 - {j \choose s} x)$ over $s \leq j \leq k$,
with a messier numerator. In particular, for $s = 2$ we should observe the
triangular numbers ${j \choose 2}$ in these factors.</p>
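<p>This identity is easy to check with a computer algebra system. The sketch below (using sympy, with the hypothetical small choice $s = 2$, $k = 3$) expands the right-hand side of \eqref{gf} and compares its Taylor coefficients against the summation identity:</p>

```python
from sympy import binomial, series, symbols

x = symbols("x")
s, k = 2, 3  # small example values

# Right-hand side of the claimed generating function identity
gf = sum(binomial(k, j) * (-1) ** (k - j) / (1 - binomial(j, s) * x)
         for j in range(k + 1))

def a(n):
    # a_{s,k}(n) from the summation identity
    return sum(binomial(k, j) * (-1) ** (k - j) * binomial(j, s) ** n
               for j in range(k + 1))

coeffs = series(gf, x, 0, 6).removeO()
print([coeffs.coeff(x, n) for n in range(6)])
print([a(n) for n in range(6)])  # the two lists should agree
```

For these values the partial fractions collapse to $2 - 3/(1 - x) + 1/(1 - 3x)$, so $a_{2,3}(n) = 3^n - 3$ for $n \geq 1$.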
<p>Since the generating function for $a_{s, k}(n)$ is rational, we know that our
sequence is C-finite: it satisfies a linear recurrence with constant
coefficients. We don’t know what the recurrence is, but we know that it exists
and could probably find it if we wanted.</p>
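<p>For instance, with the hypothetical choice $s = 2$ and $k = 3$, the factors $(1 - x)(1 - 3x) = 1 - 4x + 3x^2$ in the denominator of \eqref{gf} suggest the recurrence $a(n) = 4a(n - 1) - 3a(n - 2)$ for large enough $n$, which a quick check confirms:</p>

```python
from math import comb

def a(n, s=2, k=3):
    # a_{s,k}(n) from the summation identity
    return sum(comb(k, j) * (-1) ** (k - j) * comb(j, s) ** n
               for j in range(k + 1))

# (1 - x)(1 - 3x) = 1 - 4x + 3x^2 suggests a(n) = 4 a(n-1) - 3 a(n-2);
# here it holds for n >= 3 (the numerator also has degree 2).
for n in range(3, 20):
    assert a(n) == 4 * a(n - 1) - 3 * a(n - 2)
```

Note that the recurrence only kicks in once $n$ passes the degree of the numerator; here $a(2) = 6$ while $4a(1) - 3a(0) = 0$.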
<p>The generating function in \eqref{gf} has poles at $x = {j \choose s}^{-1}$ for
$s \leq j \leq k$. If we assume that $k \geq s$ (a perfectly reasonable
assumption), then the smallest pole is $x = {k \choose s}^{-1}$. This means
that the radius of convergence of our generating function is ${k \choose
s}^{-1}$, and so</p>
<script type="math/tex; mode=display">\limsup_n \sqrt[n]{a_{s, k}(n)} = {k \choose s}.</script>
<p>In other words, for every $\epsilon > 0$ we have</p>
<script type="math/tex; mode=display">a_{s, k}(n) \leq \left( {k \choose s} + \epsilon \right)^n</script>
<p>for sufficiently large $n$.</p>
<p>This is quite a bit of information to learn from such a gnarly-looking sum!</p>
<h1 id="notes">Notes</h1>
<p>The sequence $\lambda_2(k, n) = a_{2, k}(n)$ is
<a href="http://oeis.org/A059117">A059117</a> in the OEIS. I stumbled upon it while
working on a seemingly unrelated problem about lines. I thought that the
$\lambda_s$ numbers might be related to a different set of sequences
somehow. In retrospect I’m certainly wrong, but they’re cool anyway. I was
helped immensely in the special case $s = 2$ by Michael Somos’s answer to <a href="https://math.stackexchange.com/questions/3288280/">my
Math.SE</a> question. This is
all really just generalizing his answer and playing with the result.</p>
<div class="footnotes">
<ol>
<li id="fn:g-identity">
<p>This is routine to verify by induction. <a href="#fnref:g-identity" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>

<h1 id="the-futility-of-debate">The futility of debate</h1>
<p>2019-06-17 · <a href="https://rwbogl.github.io/the-futility-of-debate">https://rwbogl.github.io/the-futility-of-debate</a></p>
<p>I recently watched <a href="https://www.youtube.com/watch?v=DzLD7fGtjyg">the final debate of William Buckley’s <em>Firing
Line</em></a>, filmed in 1999. <em>Firing
Line</em>, and perhaps this farewell episode in particular, demonstrates a level of
companionship and vigor that we rarely see in public debate in the United
States. But even with this in mind, it also demonstrates a fundamental problem
with debates: Debates are too adversarial to explore nuance, leading to
simplified solutions and misunderstandings of the problem at hand. Complicated
problems sometimes need complicated solutions, and a debate will always miss
this. I want to focus on one small exchange in this episode that typifies this
problem.</p>
<p><em>Firing Line</em> debates are formal and focused on a particular resolution. In
this episode, the resolution is “the federal government should not impose a tax
on electronic commerce.” We will focus on an exchange between Senator Wyden and
Professor Fox on this issue. Senator Wyden’s opening statement begins
<a href="https://www.youtube.com/watch?v=DzLD7fGtjyg&feature=youtu.be&t=2953">here</a>.
The exchange itself begins <a href="https://youtu.be/DzLD7fGtjyg?t=3174">here</a>.</p>
<p>Reading from a prepared sheet, Senator Wyden announces:</p>
<blockquote>
<p>In a survey of 1,500 mainstreet business districts nationwide, 74% have gone
online since 1997. Because of that, many small businesses that had been on
the verge of collapse, threatened by mega malls, retail giants, and mail
order companies, are now thriving, and economically depressed towns that had
been losing population are now growing.</p>
</blockquote>
<p>After a few moments of back-and-forth, Professor Fox raises his main objection:</p>
<blockquote>
<p>Senator, I’m sure that you’re aware that 75% of the internet activity that
takes place occurs within just 50 firms, so the notion that this is
broadening out to large numbers of firms is simply inconsistent with the
facts of the situation.</p>
</blockquote>
<p>Senator Wyden retorts:</p>
<blockquote>
<p>So you’re saying that the ABC study of 1,500 mainstreet districts is wrong? I
think it’s clear.</p>
</blockquote>
<p>Professor Fox clarifies his position, but the issue is dropped for time.</p>
<p>Before explaining what went wrong here, let’s hit the theoretical problem.
Debate is, at its core, a method to uncover truth. You present your points,
I try to refute them, then we exchange roles. This is supposed to work like
a crucible for contrasting ideas. The stronger argument will convince the
audience, taking us one step closer to discovering the “truth,” or “right
answer.”</p>
<p>The problem is that debate can only paint in broad strokes. Given two ideas, it
can vote for one or the other. It cannot blend two together. For example, what
if your point comprises five or six smaller points? If I take your main
point, do I have to take the smaller ones as well? What if I changed your mind
on one of those points? Can we modify our positions or reach a compromise? In
a debate, the answer to these questions is always the most restrictive
possible. You must take all of the smaller points, we cannot explore them in
detail, and we certainly cannot modify our positions. In short, debate can
never say “you’re both right.”</p>
<p>Let me summarize the exchange between Wyden and Fox:</p>
<ul>
<li>
<p>Wyden: I care about the success of small businesses, and 74% of businesses have
adopted internet commerce.</p>
</li>
<li>
<p>Fox: The adoption rate does not equal the success rate, and there is good
evidence that the success rate is much lower than 74%.</p>
</li>
<li>
<p>Wyden: My evidence is foolproof. If you doubt my position, then you doubt my
foolproof evidence, therefore you are wrong.</p>
</li>
</ul>
<p>If Wyden’s concerns for the little guy are genuine, then Fox’s objection should
cause concern. Maybe the internet <em>isn’t</em> as good for small businesses as we
thought! We should talk about it more and find out what we mean by “good for
small businesses.” Is the “success rate” what we care about? If so, how do we
measure it? What rate is “good enough”? Whatever the case, the “truth” must
clearly take these questions into consideration. If the men were interested in
devising a real solution and actually understanding the issue, then they would
have stopped to discuss these points. But because Wyden and Fox are in
a debate, they are only interested in winning. Wyden makes an objection that
completely misunderstands Fox’s point, and the two never return to the subject.
The nuance is lost.</p>
<p>It is a travesty to lose valuable discussions because of a bad format. Had Fox
and Wyden been interested in conversing, we would have been privy to a much
more interesting conversation. We would have seen real disagreement and
progress towards understanding. Instead, they had to read from their prepared
sheets and hope that their argument pandered to the audience enough to win.</p>
<p>Avoiding this problem is a major benefit of longer, less structured
discussions. Podcasts sometimes have great examples of these conversations “in
the wild.” Shows like the <a href="http://podcasts.joerogan.net/">Joe Rogan Experience</a>
and <a href="https://samharris.org/podcast/">Making Sense with Sam Harris</a> successfully
cover a pretty wide variety of topics in depth because there is no pressure to
“win.” There is only a conversation to be had. You can’t dodge questions and
you can’t appeal to an audience—you’re just having a “regular” conversation.
These feel closer in spirit to a <a href="https://en.wikipedia.org/wiki/Socratic_dialogue">Socratic
dialogue</a> than any public
debate I’ve ever seen.</p>

<h1 id="numerically-verifying-properties-of-random-permutations">Numerically verifying properties of random permutations</h1>
<p>2019-05-28 · <a href="https://rwbogl.github.io/numerically-verifying-properties-of-random-permutations">https://rwbogl.github.io/numerically-verifying-properties-of-random-permutations</a></p>
<p>According to some generating function magic, the proportion of permutations on
$n$ letters which have cycles only of lengths divisible by a fixed integer $k$
is <script type="math/tex">\frac{(1/k)^{\overline{n/k}}}{(n/k)!},</script> where the overline means <a href="https://en.wikipedia.org/wiki/Falling_and_rising_factorials">“rising
factorial”</a>. I
will prove this, but then I will <em>actually</em> demonstrate that it is true, using
computational methods. Empirical verification carries a certainty that a “mere
proof” cannot!</p>
<h1 id="background">Background</h1>
<p>In <a href="https://www.math.upenn.edu/~wilf/DownldGF.html"><em>generatingfunctionology</em></a>,
Wilf gives a very clear discussion of the exponential formula. This formula
tells us how to count structures built from smaller, “connected” structures.
Permutations fit this mold exactly. Every permutation is the product of
disjoint cycles, which we can think of as being “connected.” There is a lot to
say here, but we can just quote two special cases of the result:</p>
<p>Let $d_n = (n - 1)!$ be the number of cycles which permute exactly $n \geq 1$
letters, and</p>
<script type="math/tex; mode=display">D(x) = \sum_{n \geq 1} \frac{d_n}{n!} x^n = -\log(1 - x)</script>
<p>be the <a href="https://en.wikipedia.org/wiki/Generating_function">exponential generating
function</a> of this sequence.
Let $h_n = n!$ be the number of permutations on $n$ letters (which may touch
fewer than $n$ elements), and</p>
<script type="math/tex; mode=display">H(x) = \sum_{n \geq 0} \frac{n!}{n!} x^n = \frac{1}{1 - x}</script>
<p>be the exponential generating function of this sequence. The exponential
formula tells us that</p>
<script type="math/tex; mode=display">H(x) = e^{D(x)},</script>
<p>which is easy to check in this case. This does not seem so impressive, but it
is far more useful when we do not know one of the two sides.</p>
<h1 id="a-proof">A proof</h1>
<p>It turns out that the exponential formula applies in broader situations than
what we have described so far. In particular, we can use it to prove the result
stated at the beginning of this post. For a fixed integer $k$, we will apply
the exponential formula in a subtler way.</p>
<p>Let $d_n$ be the number of cycles which permute exactly $n$ letters, <em>but have
length divisible by $k$</em>. That is,</p>
<script type="math/tex; mode=display">d_n = (n - 1)! [k \backslash n],</script>
<p>where the brackets are <a href="https://en.wikipedia.org/wiki/Iverson_bracket">Iverson
brackets</a>. The exponential
generating function $D(x)$ is now</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
D(x) &= \sum_{n \geq 1} \frac{(n - 1)!}{n!} x^n [k \backslash n] \\
&= \sum_{m \geq 1} \frac{x^{mk}}{mk} \\
&= -\frac{1}{k} \log(1 - x^k).
\end{align*} %]]></script>
<p>The sequence $h_n$ now counts the number of permutations on $n$ letters which
only have cycles with lengths divisible by $k$. The exponential formula gives
us the same result as before, namely</p>
<script type="math/tex; mode=display">H(x) = (1 - x^k)^{-1/k}.</script>
<p>If we knew what generating function the right-hand side was, then we could
equate coefficients and be done. In accordance with the <a href="https://en.wikipedia.org/wiki/Binomial_series">binomial
theorem</a>, the sequence $a_m =
{\alpha \choose m}$ is generated by $(1 + x)^\alpha$. We can use this to write
$(1 - x^k)^{-1/k}$ as a generating function.</p>
<p>If $(-1)^{1/k}$ is any $k$th root of $(-1)$, then</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
(1 - x^k)^{-1/k} &= (1 + ((-1)^{1/k} x)^k)^{-1/k} \\
&= \sum_m {-1/k \choose m} (-1)^m x^{mk}.
\end{align*} %]]></script>
<p>So we have</p>
<script type="math/tex; mode=display">H(x) = \sum_m {-1/k \choose m} (-1)^m x^{mk}.</script>
<p>The coefficient on $x^n / n!$ on the left-hand side is $h_n$, the number that
we want to find. The coefficient on $x^n / n!$ on the right-hand side is</p>
<script type="math/tex; mode=display">n! (-1)^m {-1/k \choose m} [mk = n] = n! (-1)^{n / k} {-1/k \choose n / k}.</script>
<p>Since there are $n!$ permutations on $n$ letters total, this tells us that the
proportion of them containing only cycles of length divisible by $k$ is</p>
<script type="math/tex; mode=display">\frac{h_n}{n!} = (-1)^{n / k} {-1/k \choose n / k}.</script>
<p>The right-hand side looks daunting, but it simplifies a lot. First, let’s let $n / k
= m$ again, to make things look nicer. Using the falling factorial definition
of the binomial coefficient we get</p>
<script type="math/tex; mode=display">(-1)^m {-1/k \choose m} = (-1)^m \frac{(-1/k)^{\underline{m}}}{m!}.</script>
<p>It isn’t hard to check that $(-1)^m(-x)^{\underline{m}} = x^{\overline{m}}$ for
all $x$ and $m$, therefore our expression reduces to</p>
<script type="math/tex; mode=display">\frac{(1/k)^{\overline{m}}}{m!} = \frac{(1/k)^{\overline{n/k}}}{(n/k)!}.</script>
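<p>The sign identity used in this step is quick to verify directly: negating each of the $m$ factors of the falling factorial gives</p>
<script type="math/tex; mode=display">(-1)^m (-x)^{\underline{m}} = (-1)^m (-x)(-x - 1) \cdots (-x - m + 1) = x (x + 1) \cdots (x + m - 1) = x^{\overline{m}}.</script>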
<p>Phew! That was a lot of work. We should at least try our result out before
continuing. With $n = 4$ and $k = 2$, this says that exactly $9$ permutations
of the $24$ on $4$ letters have only even cycles. These are:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
&(0, 1, 2, 3) \\
&(0, 1, 3, 2) \\
&(0, 1)(2, 3) \\
&(0, 2, 1, 3) \\
&(0, 2, 3, 1) \\
&(0, 2)(1, 3) \\
&(0, 3, 1, 2) \\
&(0, 3, 2, 1) \\
&(0, 3)(1, 2).
\end{align*} %]]></script>
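<p>We can confirm this count (and the analogous one for larger $n$) with a short exact enumeration; the sketch below uses plain itertools rather than sympy, with hypothetical helper names:</p>

```python
from itertools import permutations

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple of images of 0..n-1."""
    n = len(perm)
    seen = [False] * n
    lengths = []
    for i in range(n):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = perm[j]  # follow the cycle through i
                length += 1
            lengths.append(length)
    return lengths

def count_even_cycled(n):
    """Number of permutations of n letters whose cycles all have even length."""
    return sum(all(length % 2 == 0 for length in cycle_lengths(p))
               for p in permutations(range(n)))

print(count_even_cycled(4))  # prints 9: the nine permutations listed above
```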
<p>With $n = 10$ and $k = 2$, this says that roughly 24.6% of all
$10!$ permutations on ten letters have only even cycles.</p>
<h1 id="a-real-proof">A <em>real</em> proof</h1>
<p>I just can’t believe this result without some computational evidence.
Fortunately, this is easy to provide. Here is some Python code to do exactly
that:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">sympy.combinatorics.permutations</span> <span class="kn">import</span> <span class="n">Permutation</span>
<span class="kn">from</span> <span class="nn">sympy</span> <span class="kn">import</span> <span class="n">rf</span><span class="p">,</span> <span class="n">factorial</span><span class="p">,</span> <span class="n">S</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="k">def</span> <span class="nf">cycles_divisible</span><span class="p">(</span><span class="n">perm</span><span class="p">,</span> <span class="n">k</span><span class="p">):</span>
<span class="s">"""Check if every cycle in perm has length divisible by k."""</span>
<span class="n">cycles</span> <span class="o">=</span> <span class="n">perm</span><span class="o">.</span><span class="n">cycle_structure</span>
<span class="k">return</span> <span class="nb">all</span><span class="p">(</span><span class="n">length</span> <span class="o">%</span> <span class="n">k</span> <span class="o">==</span> <span class="mi">0</span> <span class="k">for</span> <span class="n">length</span> <span class="ow">in</span> <span class="n">cycles</span><span class="o">.</span><span class="n">keys</span><span class="p">())</span>
<span class="k">def</span> <span class="nf">expected_proportion</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">k</span><span class="p">):</span>
<span class="n">m</span> <span class="o">=</span> <span class="n">n</span> <span class="o">//</span> <span class="n">k</span>
<span class="k">return</span> <span class="n">rf</span><span class="p">(</span><span class="n">S</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="n">k</span><span class="p">,</span> <span class="n">m</span><span class="p">)</span> <span class="o">/</span> <span class="n">factorial</span><span class="p">(</span><span class="n">m</span><span class="p">)</span>
<span class="n">n</span> <span class="o">=</span> <span class="mi">10</span>
<span class="n">sample_size</span> <span class="o">=</span> <span class="mi">2000</span>
<span class="n">k</span> <span class="o">=</span> <span class="mi">2</span>
<span class="n">samples</span> <span class="o">=</span> <span class="p">[</span><span class="n">cycles_divisible</span><span class="p">(</span><span class="n">Permutation</span><span class="o">.</span><span class="n">random</span><span class="p">(</span><span class="n">n</span><span class="p">),</span> <span class="n">k</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">sample_size</span><span class="p">)]</span>
<span class="n">proportions</span> <span class="o">=</span> <span class="p">[</span><span class="nb">sum</span><span class="p">(</span><span class="n">samples</span><span class="p">[:</span><span class="n">m</span><span class="p">])</span> <span class="o">/</span> <span class="n">m</span> <span class="k">for</span> <span class="n">m</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">style</span><span class="o">.</span><span class="n">use</span><span class="p">(</span><span class="s">"ggplot"</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">axhline</span><span class="p">(</span><span class="n">expected_proportion</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">k</span><span class="p">),</span> <span class="n">ls</span><span class="o">=</span><span class="s">"--"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"black"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">"Expected proportion"</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">proportions</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlim</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">sample_size</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">()</span>
<span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s">"Proportion of permutations with even cycles"</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s">"Sample size"</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s">"Proportion"</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>
<p>This snippet, run three times, generated the following plot:</p>
<p><img src="/files/cycles.svg" alt="Graph showing three 'random' lines converging to our expected
value." /></p>
<p>Amazing! The proportions are converging to what we expect, which is a wonderful
thing to see. In some ways explaining this picture is the best part of the
above proof. Why slog through such computations if you aren’t explaining
something surprising?</p>
<p>Anyway, none of this is original. It’s all from exercise 2 in Chapter 3 of
Wilf’s <em>generatingfunctionology</em>, which I highly recommend.</p>

<h1 id="the-oglethorpe-bowls">The Oglethorpe Bowls</h1>
<p>2019-05-27 · <a href="https://rwbogl.github.io/oglethorpe-commencement-awards">https://rwbogl.github.io/oglethorpe-commencement-awards</a></p>
<p>At Oglethorpe’s 2019 commencement I was presented with both the Sally Hull
Weltner Award for Scholarship and the James Edward Oglethorpe Award. The Sally
Weltner award is a very nice bowl which “honors the student in the graduating
class who has attained the highest grade-point average with the greatest number
of hours in course work completed at the university.” The James Edward
Oglethorpe award is “presented annually to two individuals in the graduating
class who, in the opinion of the faculty, have excelled in scholarship,
leadership and character.”</p>
<p>I was honored to share the Sally Weltner award with my fellow Honors student
and friend Gillian Rabin. (Ties are unusual for the Sally Weltner award, but
happily not this year!) I have known Gillian since our shared Scholarship
Weekend with the theater department four years ago. She has been a remarkable
student and performer over the years, and it was wonderful to take the stage
with her.</p>
<p>I was also honored to share the James Edward Oglethorpe Award with my good
friend Brad Firchow. I cannot think of any individual who exemplifies
leadership and character the way Brad does. To be frank, he is clearly carrying
the weight of the award in those two categories this year. Brad had a far
larger impact at Oglethorpe than I did, and he accomplished this by being a
stupendous leader. He was the best person to receive the award.</p>
<p>When I explained to my family what the James Edward Oglethorpe Award was, my
dad called it the “teacher’s pet award.” There is some truth to this. I
certainly would not have won the award without the votes of our beloved
Oglethorpe faculty. Their recognition means far more than any diploma I
obtained that day.</p>
<p>I am not quite finished with academia. I am now a graduate student in the
mathematics department at Rutgers University. Even being so far away, I will
surely return to Oglethorpe from time to time in person and on this blog if
anything notable happens.</p>Robert Dougherty-Blissrobert.w.bliss@gmail.comAt Oglethorpe’s 2019 commencement I was presented with both the Sally Hull Weltner Award for Scholarship and the James Edward Oglethorpe Award. The Sally Weltner award is a very nice bowl which “honors the student in the graduating class who has attained the highest grade-point average with the greatest number of hours in course work completed at the university.” The James Edward Oglethorpe award is “presented annually to two individuals in the graduating class who, in the opinion of the faculty, have excelled in scholarship, leadership and character.”Matthew sequences2019-05-14T00:00:00-04:002019-05-14T00:00:00-04:00https://rwbogl.github.io/matthew-sequences<p>$\renewcommand{\floor}[1]{\left\lfloor #1 \right\rfloor}$</p>
<p>I was recently shown a fun problem by Matthew, my younger brother in ninth
grade. It goes like this:</p>
<blockquote>
<p>What is the probability of answering this question correctly by guessing?</p>
<ol>
<li>60%</li>
<li>25%</li>
<li>25%</li>
<li>30%</li>
</ol>
</blockquote>
<p>The answer is that there is no answer. The probability of choosing a correct
answer by guessing is</p>
<script type="math/tex; mode=display">P = \frac{\text{# of ways to choose correct answer}}{4}.</script>
<p>None of the answers equal the $P$ value that they generate, so none of the
answers work.</p>
<p>This problem sounds silly at first, but it contains some fascinating ideas. It
is essentially asking: Given a sequence of numbers, which numbers equal the
frequency with which they occur in the sequence? The following is my normalized
reframing of it.</p>
<p><strong>Definition.</strong> A <em>Matthew sequence</em> is a finite sequence of reals in $(0, 1]$
that sum to $1$. For such a sequence $S$ consisting of terms $a_1, \dots, a_n$,
let $f_S(x)$ be the proportion of elements of $S$ that equal $x$. That is,</p>
<script type="math/tex; mode=display">f_S(x) = \frac{\sum_{k = 1}^n [x = a_k]}{n}.</script>
<p>A <em>fixed point</em> of $S$ is a fixed point of $f_S$ in $S$. That is, it is a real
$x \in S$ such that</p>
<script type="math/tex; mode=display">x = \frac{\sum_{k = 1}^n [x = a_k]}{n}.</script>
<p>The fixed point $1$ of the Matthew sequence $[1]$ is called <em>trivial</em>.</p>
<p>In this language, the problem is asking us to find the fixed point of a given
finite sequence.</p>
<p>Examples:</p>
<ul>
<li>$1$ is a fixed point of $[1]$.</li>
<li>$1/3$ is a fixed point of $S = [1 / 2, 1 / 3, 1 / 6]$.</li>
<li>$2 / 5$ is a fixed point of $[2 / 5, 2 / 5, 1/15, 1/15, 1/15]$.</li>
<li>The sequence $[1/4, 1/4, 1/3, 1/6]$ has no fixed point.</li>
</ul>
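<p>The examples above can be checked mechanically. Here is a short Python sketch (my own, not part of the original problem) that computes the fixed points of a sequence directly from the definition:</p>

```python
# Fixed points of a Matthew sequence, straight from the definition:
# x is a fixed point when x equals the proportion of entries equal to x.
from fractions import Fraction

def fixed_points(seq):
    """Return the set of fixed points of the Matthew sequence `seq`."""
    assert sum(seq) == 1, "a Matthew sequence must sum to 1"
    n = len(seq)
    return {x for x in seq if x == Fraction(seq.count(x), n)}

F = Fraction
assert fixed_points([F(1)]) == {F(1)}
assert fixed_points([F(1, 2), F(1, 3), F(1, 6)]) == {F(1, 3)}
assert fixed_points([F(2, 5), F(2, 5), F(1, 15), F(1, 15), F(1, 15)]) == {F(2, 5)}
assert fixed_points([F(1, 4), F(1, 4), F(1, 3), F(1, 6)]) == set()
```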
<p>I think that there are a few natural questions:</p>
<ol>
<li>What numbers can occur as fixed points of a Matthew sequence?</li>
<li>How many fixed points can occur in a Matthew sequence of fixed length?</li>
<li>Do there exist Matthew sequences with arbitrarily many fixed points?</li>
<li>If we know a particular fixed point, can we say anything about the size of
the sequence it comes from?</li>
</ol>
<p>In this post, I shall answer the first three questions.</p>
<h1 id="classifying-fixed-points">Classifying fixed points</h1>
<p>Not every number can be a fixed point of a Matthew sequence. Fixed points must
be rational, for instance, but we can get even harsher restrictions. For
example, it is impossible for $\tfrac{1}{2}$ to be a fixed point of any Matthew
sequence. For it to be a fixed point it would need to make up exactly
$\tfrac{1}{2}$ of the elements of the sequence. However, since the sequence
must sum to $1$, this means that the sequence must be $[1/2, 1/2]$, of which
$1/2$ is <em>not</em> a fixed point. This argument generalizes to numbers greater than
$\tfrac{1}{2}$ as well.</p>
<p><strong>Lemma.</strong> Every nontrivial fixed point of a Matthew sequence is strictly less
than $1/2$.</p>
<p><strong>Proof.</strong> Suppose that a Matthew sequence $S$ had a fixed point $p \geq 1/2$.
Since $2p \geq 1$, the multiplicity of $p$ can be at most $2$. If it is exactly
$2$, then we have $2p \leq 1$ by the sum condition, which then implies $p
= 1/2$. We must conclude that $S = [1/2, 1/2]$, but then $p = 1/2$ is not
a fixed point.</p>
<p>Suppose instead that $p$ occurs exactly once in $S$. For $p$ to be
a fixed point, we must have $p = 1 / n$ where $n$ is the size of our Matthew
sequence. Since $p \geq 1 / 2$, this implies $n \leq 2$, so our sequence is
either $[p]$ or $[p, 1 - p]$. The first case forces $p = 1$, the trivial fixed
point, so we must be in the second case. But then $p$ could only be
a fixed point if $p = 1/2$, which produces the sequence $[1/2, 1/2]$. Therefore
$p$ cannot be a fixed point. $\blacksquare$</p>
<p><strong>Proposition.</strong> If $\tfrac{a}{b}$ is a nontrivial fixed point of a Matthew
sequence, then $1 \leq a^2 < b$ and $b > 2$.</p>
<p><strong>Proof.</strong> Suppose that $a / b$ is a nontrivial fixed point with multiplicity $m$
in a Matthew sequence of length $n$. Since $a / b$ is a fixed point, it follows
that $an = bm$. We can assume that $a$ and $b$ are coprime, hence $a$ divides
$m$, meaning that $a \leq m$. By the sum condition we have $a \leq m \leq
\floor{b / a}$, so $a \leq \floor{b / a}$. If $b / a$ is an integer, then $a
= 1$ since $a$ and $b$ are coprime. The previous lemma then implies $b > 2$,
which gives $a^2 < b$. If $b / a$ is not an integer then we immediately obtain
$a^2 < b$. $\blacksquare$</p>
<p><strong>Proposition.</strong>
If $a / b$ is a rational such that $1 \leq a^2 < b$ and $b > 2$, then $a / b$
is a fixed point of some Matthew sequence.</p>
<p><strong>Proof.</strong> We will construct a Matthew sequence with $b$ elements. Begin by
placing $a$ copies of $a / b$ into $S$. This gives the partial sum $a^2 / b$,
which is strictly less than $1$. The remaining sum to recover is $1 - a^2 / b$.
If $a \neq 1$, then</p>
<script type="math/tex; mode=display">\frac{1 - a^2 / b}{b - a} \neq \frac{a}{b}.</script>
<p>Thus, we may add $b - a$ copies of</p>
<script type="math/tex; mode=display">\frac{1 - a^2 / b}{b - a}</script>
<p>to $S$. If $a = 1$ and $b > 3$, then instead add $1/2 - 1/2b$ and $b - 2$ copies of</p>
<script type="math/tex; mode=display">\frac{1 - \frac{1}{b} - \frac{1}{2} + \frac{1}{2b}}{b - 2}.</script>
<p>A short computation shows that this value, like $1/2 - 1/2b$ itself, equals
$1 / b$ only when $b = 3$, so the multiplicity of $1 / b$ is exactly one. In the
excluded case $a = 1$, $b = 3$, the sequence $[1/2, 1/3, 1/6]$ has $1/3$ as
a fixed point. In every case, $a / b$ is in fact a fixed point. $\blacksquare$</p>
<p><strong>Theorem.</strong> The rational $a / b$ is a nontrivial fixed point of some Matthew
sequence iff $1 \leq a^2 < b$ and $b > 2$.</p>
<p><strong>Proof.</strong> Combine the two previous propositions.</p>
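<p>The construction in the proof is explicit enough to code up. The following Python sketch is my own reading of it (including a special case at $a = 1$, $b = 3$, where the generic filler value collides with $1/b$ and I substitute $[1/2, 1/3, 1/6]$); it builds a sequence and verifies the fixed point for many small $a / b$:</p>

```python
# Build a Matthew sequence with nontrivial fixed point a/b, following
# the construction in the proof above.
from fractions import Fraction
from math import gcd

def construct(a, b):
    assert gcd(a, b) == 1 and 1 <= a * a < b and b > 2
    x = Fraction(a, b)
    if a != 1:
        # a copies of a/b, then b - a equal copies of the remaining mass
        return [x] * a + [(1 - a * x) / (b - a)] * (b - a)
    if b == 3:
        # special case: the generic filler below would equal 1/b here
        return [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    head = Fraction(1, 2) - Fraction(1, 2 * b)
    return [x, head] + [(1 - x - head) / (b - 2)] * (b - 2)

def is_fixed_point(x, seq):
    return sum(seq) == 1 and x == Fraction(seq.count(x), len(seq))

for a in range(1, 5):
    for b in range(3, 30):
        if gcd(a, b) == 1 and a * a < b:
            assert is_fixed_point(Fraction(a, b), construct(a, b))
```

<p>For $a = 2$, $b = 5$ this reproduces the example $[2/5, 2/5, 1/15, 1/15, 1/15]$ from earlier.</p>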
<h1 id="counting-fixed-points">Counting fixed points</h1>
<p><strong>Theorem.</strong> Given a positive integer $n$, there exists a Matthew sequence of
length $O(n^3)$ with at least $n$ fixed points.</p>
<p><strong>Proof.</strong> Let $N$ be a really big, to-be-determined number. Construct $S$ by
adding a single $1 / N$, two $2 / N$’s, three $3 / N$’s, and so on; each step
of this construction produces a new fixed point if $S$ ends with $N$ elements.
We may thus produce $n$ fixed points so long as</p>
<script type="math/tex; mode=display">\sum_{k = 1}^n k \frac{k}{N} \leq 1.</script>
<p>That is, so long as</p>
<script type="math/tex; mode=display">n(n + 1)(2n + 1) \leq 6N.</script>
<p>The left hand side is $O(n^3)$, so taking $N = cn^3$ for a suitable
constant $c$ will produce $n$ fixed points in $S$.</p>
<p>To finish the construction we need to ensure that $S$ sums to $1$ and has $N$
elements without ruining too many of the previous fixed points. When done with
our initial construction, we will have added $S_n = n(n + 1) / 2$ elements. If
we let $p$ be the sum of the first $S_n$ elements, then we could add $N - S_n$
copies of</p>
<script type="math/tex; mode=display">\frac{1 - p}{N - S_n}</script>
<p>to make $S$ sum to $1$. This removes at most one fixed point, so we have at
least $n - 1$ fixed points. If we carry out this process from the beginning
with $n + 1$ instead of $n$, then we will still obtain a sequence of length
$O(n^3)$ with at least $n$ fixed points. $\blacksquare$</p>
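<p>Here is a Python sketch of the padding construction in the proof. One assumption is mine: I take $N$ one larger than $n(n+1)(2n+1)/6$, so that the filler value is positive:</p>

```python
# k copies of k/N for k = 1..n, padded with equal filler values so that
# the sequence has length N and sums to 1.
from fractions import Fraction

def many_fixed_points(n, N):
    S = [Fraction(k, N) for k in range(1, n + 1) for _ in range(k)]
    p = sum(S)
    assert p < 1 and len(S) < N
    return S + [(1 - p) / (N - len(S))] * (N - len(S))

def fixed_points(seq):
    return {x for x in seq if x == Fraction(seq.count(x), len(seq))}

n = 5
N = n * (n + 1) * (2 * n + 1) // 6 + 1   # ensures the partial sum is < 1
S = many_fixed_points(n, N)
assert sum(S) == 1 and len(S) == N
assert len(fixed_points(S)) >= n - 1
```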
<p><strong>Observation.</strong> A fixed point of a Matthew sequence is determined by its
multiplicity. That is, if $x$ is a fixed point in a sequence of length $n$ that
occurs $m$ times, then $x = m / n$. Note that then $x$ contributes $m \cdot
m / n = m^2 / n$ to the sum of $S$.</p>
<p><strong>Proposition.</strong> A fixed point in a Matthew sequence of length $n$ has
multiplicity not exceeding $\sqrt{n}$.</p>
<p><strong>Proof.</strong> Let $x$ be a fixed point of the Matthew sequence $S$ of length $n$
with multiplicity $m$. Then $x = m / n$, and $x$ contributes $m^2 / n$ to the
sum of $S$. By the sum condition on $S$,</p>
<script type="math/tex; mode=display">\frac{m^2}{n} \leq 1.</script>
<p>Therefore $m \leq \sqrt{n}$. $\blacksquare$</p>
<p><strong>Theorem.</strong> The number of fixed points in a Matthew sequence of length $n$
does not exceed $O(n^{1/3})$.</p>
<p><strong>Proof.</strong> Suppose that a Matthew sequence $S$ of length $n$ has exactly $r$ fixed points. Since fixed
points are determined by their multiplicities, each must have a distinct
multiplicity as well, call them $m_1, m_2, \dots, m_r$. The fixed points
contribute exactly</p>
<script type="math/tex; mode=display">\sum_{k = 1}^r \frac{m_k^2}{n}</script>
<p>to the sum of $S$, therefore</p>
<script type="math/tex; mode=display">\sum_{k = 1}^r \frac{m_k^2}{n} \leq 1.</script>
<p>However, since the multiplicities are distinct integers in $[1, n]$, the sum
on the left is smallest when $m_k = k$. This yields</p>
<script type="math/tex; mode=display">\sum_{k = 1}^r \frac{k^2}{n} \leq 1,</script>
<p>which gives, approximately,</p>
<script type="math/tex; mode=display">\frac{r^3}{6n} \leq 1.</script>
<p>Therefore $r \leq (6n)^{1/3}$. $\blacksquare$</p>
<h1 id="required-length-of-matthew-sequences">Required Length of Matthew sequences</h1>
<p>If $a / b$ is a fixed point of some Matthew sequence, we know that the sequence
length must be a multiple of $b$. Can it be anything other than $b$ itself?</p>
<p>As an example, consider $4 / 17$. By one of our previous theorems, we know that
$4 / 17$ must be a fixed point of <em>some</em> Matthew sequence. How long is this
sequence? If $4 / 17$ has multiplicity $m$ in a sequence of length $n$, then</p>
<script type="math/tex; mode=display">n = m \frac{17}{4},</script>
<p>$17$ divides $n$, and $m \leq \floor{17 / 4}$. These all give us the inequality</p>
<script type="math/tex; mode=display">17 \leq n \leq \left( \frac{17}{4} \right)^2 = 18.0625.</script>
<p>Since $n$ is a multiple of $17$, it follows that $n = 17$.</p>
<p>This argument works because $17$ is not much bigger than $4^2 = 16$. A similar
argument under the same hypotheses gives a partial result.</p>
<p><strong>Theorem.</strong> If $a / b$ is a fixed point of a Matthew sequence $S$ such that
$a$ and $b$ are coprime and $b - a^2 < a$, then $|S| = b$.</p>
<p><strong>Proof.</strong> Let $n = |S|$ and $m$ be the multiplicity of $a / b$. By definition,
we have</p>
<script type="math/tex; mode=display">n = m \frac{b}{a}.</script>
<p>Since $m \leq \floor{b / a}$ and $b$ divides $n$, we have</p>
<script type="math/tex; mode=display">b \leq n \leq \left( \frac{b}{a} \right)^2.</script>
<p>By a previous theorem, for $a / b$ to be a nontrivial fixed point we must have
$1 \leq a^2 < b$. If we write $b = a^2 + k$ for some positive integer $k$, then our
inequality becomes</p>
<script type="math/tex; mode=display">b \leq n \leq b + k + \frac{k^2}{a^2}.</script>
<p>Since $k = b - a^2 < a$ and $n$ is an integer, the inequality reduces to</p>
<script type="math/tex; mode=display">b \leq n \leq b + k.</script>
<p>Since $k = b - a^2 < b$, the only multiple of $b$ in the interval $[b, b + k]$
is $b$ itself. As $n$ is a multiple of $b$, it follows that $n = b$.
$\blacksquare$</p>
<p>In general, this question remains open. We could probably think a little harder
with our number theory brains to find a better answer, but this seems like
a good place to stop for now.</p>Robert Dougherty-Blissrobert.w.bliss@gmail.com$\renewcommand{\floor}[1]{\left\lfloor #1 \right\rfloor}$Two perspectives on the Möbius transform2019-05-08T00:00:00-04:002019-05-08T00:00:00-04:00https://rwbogl.github.io/two-perspectives-on-the-mobius-transform<p>The Möbius function $\mu$ is special because of its inversion principle:</p>
<script type="math/tex; mode=display">g(n) = \sum_{d \backslash n} f(d) \iff f(n) = \sum_{d \backslash n} \mu(d) g(n / d).</script>
<p>That is, every sum over divisors can be uniquely inverted. This seems pretty
surprising to me. I want to discuss two ways that we can think of the Möbius
function: first as a function dreamed up exactly to give us that inversion
identity, and then as a natural object arising in the theory of generating functions.</p>
<h1 id="recurrences">Recurrences</h1>
<p>It is easy to retrospectively conjure up the Möbius function. Suppose that we
want to show that we can invert a sum over divisors. That is, we want to turn</p>
<script type="math/tex; mode=display">g(n) = \sum_{d \backslash n} f(d)</script>
<p>into something like</p>
<script type="math/tex; mode=display">f(n) = \sum_{d \backslash n} q(n / d) g(d)</script>
<p>for some suitable $q$. (This exact form seems like a questionable step, but
could be motivated. It’s something like the convolution of two sequences,
tailored for number theory.) We would begin by looking at that inner sum:</p>
<script type="math/tex; mode=display">\sum_{d \backslash n} q(n / d) g(d) = \sum_{d \backslash n} \sum_{k \backslash d} q(n / d) f(k).</script>
<p>Interchanging the order of summation turns this into</p>
<script type="math/tex; mode=display">\sum_{k \backslash n} \sum_{d \backslash n / k} q(n / kd) f(k)
= \sum_{k \backslash n} f(k) \sum_{d \backslash n / k} q(n / kd)
= \sum_{k \backslash n} f(k) \sum_{d \backslash n / k} q(d).</script>
<p>We would <em>really</em> like for that inner sum to equal $[n = k]$, or $[n / k = 1]$.
That is, we would like for $q(d)$ to satisfy the identity</p>
<script type="math/tex; mode=display">\sum_{d \backslash m} q(d) = [m = 1]</script>
<p>for all nonnegative integers $m$. This is precisely the definition of the
Möbius function, so $q(d) = \mu(d)$. The proof ends with</p>
<script type="math/tex; mode=display">\sum_{k \backslash n} f(k) \sum_{d \backslash n / k} \mu(d)
= \sum_{k \backslash n} f(k) [n / k = 1]
= f(n).</script>
<p>So, armed with a little bit of foresight (or hindsight), it isn’t <em>too</em> hard to
guess what the Möbius function should be.</p>
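<p>In fact, the identity $\sum_{d \backslash m} q(d) = [m = 1]$ determines $q$ completely: peel off the $d = m$ term and solve recursively. A quick sympy check (mine, not part of the original derivation) confirms that the solution agrees with the built-in Möbius function:</p>

```python
# Solve sum_{d | m} q(d) = [m = 1] for q by peeling off the d = m term,
# then compare with sympy's Möbius function.
from sympy import divisors, mobius

q = {1: 1}
for m in range(2, 31):
    q[m] = -sum(q[d] for d in divisors(m) if d < m)

assert all(q[m] == mobius(m) for m in range(1, 31))
```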
<h1 id="generating-functions">Generating functions</h1>
<p>The previous derivation of the Möbius function still requires some ingenuity.
You need to conjecture that the inversion formula will take a certain form,
then know enough about interchanging sums over divisors so that everything
becomes clear. There is another way to discover the Möbius function that is
entirely ingenuity-free: with generating functions.</p>
<p>The two usual types of generating functions are the <em>exponential</em> and
<em>ordinary</em> kind. However, there are others. The one useful for us is the
<em>Dirichlet</em> generating function (dgf) of a sequence, defined as</p>
<script type="math/tex; mode=display">D(s) = \sum_{n \geq 1} \frac{a_n}{n^s}</script>
<p>for a sequence $\{a_n\}$. As an example, the Riemann zeta function is just the
Dirichlet generating function of the sequence $a_n \equiv 1$:</p>
<script type="math/tex; mode=display">\zeta(s) = \sum_{n \geq 1} \frac{1}{n^s}.</script>
<p>Like other generating functions, dgfs satisfy a useful multiplication rule. If
$F(s)$ and $G(s)$ generate $\{a_n\}$ and $\{b_n\}$, respectively, then $F(s)
G(s)$ generates</p>
<script type="math/tex; mode=display">\sum_{d \backslash n} a_d b_{n / d}.</script>
<p>This lets us formulate the inversion principle in terms of dgfs. The equation</p>
<script type="math/tex; mode=display">g(n) = \sum_{d \backslash n} f(d)</script>
<p>is equivalent to saying that</p>
<script type="math/tex; mode=display">G(s) = F(s) \zeta(s),</script>
<p>where $G$ and $F$ are the dgfs of $g(n)$ and $f(n)$. Simply multiplying by
$\zeta^{-1}(s)$ gives</p>
<script type="math/tex; mode=display">F(s) = G(s) \zeta^{-1}(s).</script>
<p>Equating coefficients gives us a relationship between $f(n)$ and $g(n)$. All
that remains is to find the coefficients of $\zeta^{-1}(s)$. Suppose that</p>
<script type="math/tex; mode=display">\zeta^{-1}(s) = \sum_{n \geq 1} \frac{z_n}{n^s}.</script>
<p>Then, the equation $\zeta(s) \zeta^{-1}(s) = 1$ is equivalent to saying that</p>
<script type="math/tex; mode=display">\sum_{d \backslash n} z_d = [n = 1].</script>
<p>This is exactly the definition of the Möbius function. That is, $z_n = \mu(n)$.
From this perspective, the Möbius transform is a trivial consequence of the
relation</p>
<script type="math/tex; mode=display">G(s) = F(s) \zeta(s) \iff F(s) = G(s) \zeta^{-1}(s).</script>
<p>In some sense, the Möbius inversion is the simplest possible thing that we
could prove here. The dgf $\zeta(s)$ is the simplest generating function that
we could imagine, and the Möbius inversion formula just comes from looking at
its inverse. We could easily discover other, more complicated identities by
considering more complicated generating functions.</p>
<p>For example, the dgf of the sequence $a_n = n$ is $\zeta(s - 1)$. Right away,
we can generate the sum of divisors sequence with $\zeta(s) \zeta(s - 1)$:</p>
<script type="math/tex; mode=display">\zeta(s) \zeta(s - 1) = 1 + \frac{3}{2^s} + \frac{4}{3^s} + \frac{7}{4^s} + \cdots</script>
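<p>These coefficients are easy to confirm. The sketch below is my own (the helper <code>dirichlet_convolve</code> is not a standard sympy function): it multiplies two dgfs by Dirichlet convolution of their coefficient sequences and recovers the sum-of-divisors values $1, 3, 4, 7, \dots$:</p>

```python
# Coefficients of the product of two Dirichlet generating functions are
# the Dirichlet convolution of the coefficient sequences.
from sympy import divisors

def dirichlet_convolve(a, b, N):
    """First N coefficients of F(s) G(s), given coefficient functions a, b."""
    return [sum(a(d) * b(n // d) for d in divisors(n)) for n in range(1, N + 1)]

# zeta(s) has coefficients 1 and zeta(s - 1) has coefficients n, so the
# product has coefficients sigma(n), the sum of the divisors of n.
coeffs = dirichlet_convolve(lambda n: 1, lambda n: n, 8)
assert coeffs == [1, 3, 4, 7, 6, 12, 8, 15]
```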
<p>However, this seems like a topic to explore another day.</p>Robert Dougherty-Blissrobert.w.bliss@gmail.comThe Möbius function $\mu$ is special because of its inversion principle:Gosper’s algorithm discovers integral identity2019-03-01T00:00:00-05:002019-03-01T00:00:00-05:00https://rwbogl.github.io/gosper-s-algorithm-discovers-integral-identity<p>While playing around with <a href="https://en.wikipedia.org/wiki/Gosper%27s_algorithm">Gosper’s
algorithm</a>, I discovered
a very neat identity:</p>
<script type="math/tex; mode=display">\begin{equation}
\sum_{k = 0}^n\frac{\binom{2k}{k}}{4^k} = \frac{2n + 1}{\pi} \int_{-\infty}^\infty \frac{x^{2n}}{(x^2 + 1)^{n + 1}}\ dx.
\end{equation}</script>
<p>I’d like to show how to derive it.</p>
<p>The identity is not as deep as it first seems. It actually comes from linking
two smaller identities together:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align}
\sum_{k = 0}^n \frac{\binom{2k}{k}}{4^k} &= \frac{2n + 1}{4^n} \binom{2n}{n} \\
\int_{-\infty}^\infty \frac{x^{2n}}{(x^2 + 1)^{n + 1}}\ dx &= \frac{\pi}{4^n} \binom{2n}{n}.
\end{align} %]]></script>
<p>The first is entirely routine:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">In</span> <span class="p">[</span><span class="mi">1</span><span class="p">]:</span> <span class="kn">import</span> <span class="nn">sympy</span> <span class="k">as</span> <span class="n">sp</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">2</span><span class="p">]:</span> <span class="kn">from</span> <span class="nn">sympy.concrete.gosper</span> <span class="kn">import</span> <span class="n">gosper_sum</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">3</span><span class="p">]:</span> <span class="kn">from</span> <span class="nn">sympy.abc</span> <span class="kn">import</span> <span class="n">n</span><span class="p">,</span> <span class="n">k</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">4</span><span class="p">]:</span> <span class="n">gosper_sum</span><span class="p">(</span><span class="n">sp</span><span class="o">.</span><span class="n">binomial</span><span class="p">(</span><span class="mi">2</span> <span class="o">*</span> <span class="n">k</span><span class="p">,</span> <span class="n">k</span><span class="p">)</span> <span class="o">/</span> <span class="mi">4</span><span class="o">**</span><span class="n">k</span><span class="p">,</span> <span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">n</span><span class="p">))</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">4</span><span class="p">]:</span> <span class="mi">4</span><span class="o">**</span><span class="p">(</span><span class="o">-</span><span class="n">n</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="mi">2</span><span class="o">*</span><span class="n">n</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">binomial</span><span class="p">(</span><span class="mi">2</span><span class="o">*</span><span class="n">n</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span>
</code></pre></div></div>
<p>Hooray for Gosper’s algorithm!</p>
<p>The second identity is a little more challenging. Here’s one way to prove it:
The generating function for the sequence</p>
<script type="math/tex; mode=display">I(n) = \int_{-\infty}^\infty \frac{x^{2n}}{(x^2 + 1)^{n + 1}}\ dx</script>
<p>is</p>
<script type="math/tex; mode=display">\frac{\pi}{\sqrt{1 - x}}.</script>
<p>(Exchange the sum and integral, evaluate the resulting geometric sum, then
rescale the remaining integral by $(1 - x)^{-1/2}$.) The generating function
for $\binom{2n}{n}$ is</p>
<script type="math/tex; mode=display">\frac{1}{\sqrt{1 - 4x}}.</script>
<p>Therefore, doing some rescaling and equating of coefficients,</p>
<script type="math/tex; mode=display">\frac{\pi}{4^n} \binom{2n}{n} = \int_{-\infty}^\infty \frac{x^{2n}}{(x^2 + 1)^{n + 1}}\ dx.</script>
<p>This gives us our identity!</p>
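<p>As a sanity check (mine, not part of the derivation), sympy can evaluate both sides of the identity exactly for small $n$:</p>

```python
# Verify the integral identity for n = 0, 1, 2 by exact integration.
import sympy as sp

x = sp.symbols("x")
for n in range(3):
    integral = sp.integrate(x**(2 * n) / (x**2 + 1)**(n + 1), (x, -sp.oo, sp.oo))
    lhs = sum(sp.binomial(2 * k, k) / sp.Integer(4)**k for k in range(n + 1))
    assert sp.simplify(lhs - (2 * n + 1) / sp.pi * integral) == 0
```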
<h2 id="a-better-identity">A better identity</h2>
<p>Applying Gosper’s algorithm to sums of the form $\sum_{k = 0}^n f(k)$ really
undersells its power. When Gosper’s algorithm works, it gives much more.</p>
<p>Gosper’s algorithm really tells us that</p>
<script type="math/tex; mode=display">\Delta \frac{2n + 1}{\pi} \int_{-\infty}^\infty \frac{x^{2n}}{(x^2 + 1)^{n + 1}}\ dx = \Delta \frac{2n + 1}{4^n} \binom{2n}{n} = \frac{\binom{2n}{n}}{4^n},</script>
<p>which is a strictly better result.</p>Robert Dougherty-Blissrobert.w.bliss@gmail.comWhile playing around with Gosper’s algorithm, I discovered a very neat identity:Ergonomics in computer science2019-02-15T00:00:00-05:002019-02-15T00:00:00-05:00https://rwbogl.github.io/ergonomics-in-computer-science<p>In computer science we often care about <em>computational complexity</em>. How long
will an algorithm take to run? How will it perform on average? In essence, <em>how
fast can we go?</em> Though this is an important consideration, it omits a crucial
implementation detail: the human factor. Is the algorithm painful or tedious to
write? Is it overly complicated for the average programmer? That is, <em>is it
ergonomic to use?</em> It is easy to demand that performance trumps all, but this
is a costly mistake. Tools and environments must account for human factors;
programming languages are no exception.</p>
<p>Ergonomics is the study of human relationship with work. It seeks to make
necessary burdens easier and more enjoyable. This goal is based upon the
observation that “long faces are not always efficient, nor are smiling ones
idle”<sup id="fnref:hancock"><a href="#fn:hancock" class="footnote">1</a></sup>. Ergonomics considers physical and psychological <em>human
factors</em>, such as comfort and stress, respectively. It may, for example,
suggest appropriate levels and types of lighting for the workplace to improve
morale, or recommend chairs with a certain amount of back support to avoid
long-term injury. This is all to improve the human condition and workplace
efficiency.</p>
<p>Ergonomics is not confined to the study of physical factors. Beginning in the 1970s,
researchers in ergonomics began to study <em>mental workload</em>. Roughly, this is
how mentally taxing certain tasks are. If a worker’s mental workload is too
high, they are likely to make mistakes or “burn out” faster than a relaxed
employee. The following is a more technical definition:</p>
<blockquote>
<p>[Mental workload is] the relation between the function relating the mental
resources demanded by a task and those resources available to be supplied by
the human operator<sup id="fnref:parasurman"><a href="#fn:parasurman" class="footnote">2</a></sup>.</p>
</blockquote>
<p>This problem is not constrained to office workers. Two studies in aviation
accidents found that as much as 18% of pilot errors were due to confusing
instrument design that made it difficult for pilots to understand their
readouts<sup id="fnref:handbookhuman"><a href="#fn:handbookhuman" class="footnote">3</a></sup>.</p>
<p>This is all to say that the tools we use and the tasks we complete should be
easy to understand. We should not be satisfied that clear design happens by
accident; we should deliberately strive for it. The consequences of ignoring
this can range from decreased worker productivity and longevity, to grave,
avoidable mistakes.</p>
<p>Consideration of mental workload is especially important in programming
language design. Programming, more than other activities, is centered around
thought. Its primary tool, the programming language, is a means to express
computational thought in a way that the computer can understand. The task of
the programmer is to mentally construct a solution to a problem, then translate
this mental solution into a concrete programming language<sup id="fnref:1"><a href="#fn:1" class="footnote">4</a></sup>. It is this
translation step that increases mental workload.</p>
<p>As an example, consider a student beginning to learn programming. They must
learn the mantra that computers are “stupid,” and will only do exactly as they
are told, and no more. They must learn to translate their mental solutions into
mechanical steps. Along the way they learn how to think in terms of this
translation. The successful student will overcome this initial hurdle, but the
mental workload of translation is always present. The mental workload that
remains is largely a function of the programming language a programmer uses.</p>
<p>In the context of software, I call the contributions to this mental workload
<em>expressive complexity</em>, in opposition to traditional <em>computational
complexity</em>. Expressive complexity, then, measures how complicated algorithms
are to implement, how difficult a language is to use, and how much mental
strain is imposed on a programmer by these objects.</p>
<h1 id="examples">Examples</h1>
<p>Consider the following task: Sum the integers from 1 to 100. Here are three
solutions:</p>
<h3 id="haskell"><a href="https://www.haskell.org/">Haskell</a></h3>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">sum</span> <span class="p">[</span><span class="mi">1</span><span class="o">..</span><span class="mi">100</span><span class="p">]</span>
</code></pre></div></div>
<h3 id="python"><a href="https://www.python.org/">Python</a></h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sum</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="mi">101</span><span class="p">))</span>
</code></pre></div></div>
<h3 id="c"><a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a></h3>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="n">sum</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">k</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">k</span> <span class="o"><=</span> <span class="mi">100</span><span class="p">;</span> <span class="n">k</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="n">sum</span> <span class="o">+=</span> <span class="n">k</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>All three solutions have the same computational complexity. However, they
clearly differ in their <em>expressive complexity</em>. The Haskell and Python
solutions are almost exact 1-1 translations of the obvious solution: Just sum
the integers from 1 to 100. In particular, the programmer does not have to
think about <em>explicit iteration</em>, which is how computers think. Instead, they
can essentially write down their mental solution.</p>
<p>In comparison, the C solution is very mechanical. It shows how the <em>computer</em>
thinks of the process of summing integers, rather than how the <em>programmer</em>
thinks of it. A separate <code class="language-plaintext highlighter-rouge">sum</code> variable must be accounted for, because
computers must do such things, not because humans must do such things.</p>
<p>This example shows that expressive complexity is not just a feature of
a particular algorithm, but rather a feature of particular <em>languages</em>. We may
thus compare languages by their expressive complexity and decide which best
suit our purpose.</p>
<p>There are a number of ways to measure expressive complexity. The earliest is
the <a href="https://en.wikipedia.org/wiki/Halstead_complexity_measures">Halstead
metrics</a>, a set of
metrics invented by Maurice Halstead to put this type of comparison on firmer
footing. These metrics include measures such as “difficulty,” “effort,” and
“vocabulary.”</p>
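<p>To give a flavor of these metrics, here is a toy Python sketch. It is a drastic simplification of Halstead’s actual operator/operand classification, which I have approximated with Python’s tokenizer and keyword list:</p>

```python
# Crude Halstead-style counts: treat punctuation and keywords as
# operators, and names/numbers/strings as operands.
import io
import keyword
import math
import tokenize

def halstead_sketch(source):
    operators, operands = [], []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP or keyword.iskeyword(tok.string):
            operators.append(tok.string)
        elif tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.STRING):
            operands.append(tok.string)
    vocabulary = len(set(operators)) + len(set(operands))
    length = len(operators) + len(operands)
    volume = length * math.log2(vocabulary)
    return vocabulary, length, volume

vocab, length, volume = halstead_sketch("sum(range(101))")
assert (vocab, length) == (5, 7)
```

<p>Even at this crude level, the counts capture something: the Python one-liner involves far fewer distinct symbols than its C counterpart.</p>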
<h1 id="looking-forward">Looking forward</h1>
<p>My undergraduate thesis compares the expressive complexity of certain popular
computer algebra systems. I chose this topic because I am deeply interested in
the applications of computers to mathematics, and hope to develop a framework
to help choose the “best” computer system.</p>
<p>Outside of my own efforts, it seems clear that ergonomically-minded languages
are on the rise. Though languages like C and <a href="https://en.wikipedia.org/wiki/Java_(programming_language)">Java</a> are the most popular
right now, they are losing some ground. Many programming language research
groups are focusing on newer, functional languages like Haskell and
<a href="https://fsharp.org/">F#</a>. Python is becoming an increasingly popular first language for
people to learn, and has an enormous community with a great set of ergonomic
libraries. The historical trend of improvement will hopefully continue.</p>
<div class="footnotes">
<ol>
<li id="fn:hancock">
<p><a href="https://peterhancock.ucf.edu/on-the-future-of-work/">“On the Future of
Work.”</a> Peter Hancock,
1997. <a href="#fnref:hancock" class="reversefootnote">↩</a></p>
</li>
<li id="fn:parasurman">
<p><a href="http://alltvantar.com/SA%20contents/Situation%20awareness%20mental%20workload%20and%20trust%20in%20automation%20-%20Viable%20empirically%20supported%20cognitive%20engineering%20constructs.pdf">“Situation Awareness, Mental Workload, and Trust in
Automation.”</a>
Parasuraman et al., 2008. <a href="#fnref:parasurman" class="reversefootnote">↩</a></p>
</li>
<li id="fn:handbookhuman">
<p><em>Handbook of Human Factors and Ergonomics</em>, 4th edition, pg. 244. <a href="#fnref:handbookhuman" class="reversefootnote">↩</a></p>
</li>
<li id="fn:1">
<p>Of course, in actuality the lines are blurred. The programmer may have an
idea of how to solve the problem in a mechanical way, and then later build
a complete mental solution. That is, once a programmer becomes adept at
thinking “like the machine,” they can use that intuition to build
solutions. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Robert Dougherty-Blissrobert.w.bliss@gmail.comIn computer science we often care about computational complexity. How long will an algorithm take to run? How will it perform on average? In essence, how fast can we go? Though this is an important consideration, it omits a crucial implementation detail: the human factor. Is the algorithm painful or tedious to write? Is it overly complicated for the average programmer? That is, is it ergonomic to use? It is easy to demand that performance trumps all, but this is a costly mistake. Tools and environments must account for human factors; programming languages are no exception.Strengthening Fatou’s Lemma in Stein and Shakarchi2019-02-06T00:00:00-05:002019-02-06T00:00:00-05:00https://rwbogl.github.io/strengthening-fatou-s-lemma-in-stein-and-shakarchi<p>My real analysis course at Emory is using Stein and Shakarchi’s <em>Real
Analysis</em>. Concurrently I’m chewing on Rudin’s <em>Real and Complex Analysis</em>
since I really enjoyed <em>Principles of Mathematical Analysis</em>. So far, I feel
that Stein and Shakarchi complicate matters a lot for a text that puts off
general measure theory for several chapters. As an example, I want to look at
Stein and Shakarchi’s statement of Fatou’s lemma.</p>
<p>First, let’s compare the differences in presentation of the Lebesgue integral
for nonnegative functions:</p>
<p><strong>Rudin:</strong></p>
<ul>
<li>Define simple functions (generalization of step functions)</li>
<li>Define integral for simple functions</li>
<li>Define integral for nonnegative functions</li>
</ul>
<p><strong>Stein and Shakarchi:</strong></p>
<ul>
<li>Define simple functions (generalization of step functions)</li>
<li>Define integral for simple functions</li>
<li>Define integral for bounded functions supported on a set of finite measure</li>
<li>Define integral for nonnegative functions</li>
</ul>
<p>The Stein and Shakarchi step of “bounded functions supported on a set of finite
measure” seems like a <em>really</em> complicated detour to make. Why do that? It
doesn’t get us any stronger results, and in fact, it makes some results even
harder to get!</p>
<p>Take, for example, Fatou’s Lemma. The general statement is this:</p>
<blockquote>
<p>If $f_n$ is a sequence of nonnegative measurable functions, then</p>
<script type="math/tex; mode=display">\int \liminf f_n \leq \liminf \int f_n.</script>
</blockquote>
<p>In Stein and Shakarchi, the statement is this:</p>
<blockquote>
<p>If $f_n$ is a sequence of nonnegative measurable functions on $\mathbb{R}^n$
that converge to the function $f$, then</p>
<script type="math/tex; mode=display">\int f \leq \liminf \int f_n.</script>
</blockquote>
<p>To prove this, Stein and Shakarchi go out of their way to prove a “bounded
convergence theorem” for bounded functions supported on a set of finite
measure. After all the time spent introducing supports and proving the bounded
convergence theorem, they produce a proof about the same length as Rudin’s, but
not in a general measure space, with an extra hypothesis and a weaker
conclusion. Sad!</p>
<p>We can improve Stein and Shakarchi’s result by applying it to the sequence
$g_n = \inf_{k \geq n} f_k$ and its limit $g = \lim g_n = \liminf f_n$. (Each
$g_n$ is nonnegative and measurable, and the $g_n$ converge to $g$ pointwise,
so their hypotheses are satisfied.) Applying their result to $g$ and $g_n$, we
obtain</p>
<script type="math/tex; mode=display">\int \liminf f_n \leq \liminf \int g_n \leq \liminf \int f_n,</script>
<p>where the last inequality comes from $g_n \leq f_n$.</p>
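As an aside, the inequality in Fatou’s lemma can be strict. The classic example is the “escaping bump” $f_n = n \cdot \chi_{(0, 1/n)}$ on $[0, 1]$: every $\int f_n = 1$, while $\liminf f_n = 0$ pointwise. Here is a crude numerical sketch of this (my own illustration, using midpoint Riemann sums and a finite truncation of the liminf, not anything from the texts):

```python
# The inequality in Fatou's lemma can be strict.  With the escaping bump
# f_n = n * 1_{(0, 1/n)} on [0, 1], each integral of f_n is 1, while
# liminf f_n = 0 pointwise, so the integral of the liminf is 0.
N = 10_000                               # midpoint-rule grid on [0, 1]
xs = [(i + 0.5) / N for i in range(N)]

def f(n, x):
    return n if 0 < x < 1 / n else 0

def integral(n):
    return sum(f(n, x) for x in xs) / N  # midpoint Riemann sum

ints = [integral(n) for n in (1, 10, 100)]
# Approximate liminf_n f_n(x) by a minimum over finitely many n.
liminf_f = [min(f(n, x) for n in range(1, 50)) for x in xs]
lhs = sum(liminf_f) / N                  # ~ integral of liminf; shrinks to 0
print(ints, lhs)                         # every integral is 1.0; lhs is small
```

Here $\int \liminf f_n = 0 < 1 = \liminf \int f_n$, so no version of Fatou’s lemma can promise equality.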
<p>This <em>still</em> doesn’t get us what we want in an arbitrary measure space. Stein
and Shakarchi do, however, move to general measure spaces several chapters
after their development of the Lebesgue integral. There is likely some
pedagogical wisdom in this. It’s all build up to the dominated convergence
theorem and company anyway, I suppose.</p>Robert Dougherty-Blissrobert.w.bliss@gmail.comMy real analysis course at Emory is using Stein and Shakarchi’s Real Analysis. Concurrently I’m chewing on Rudin’s Real and Complex Analysis since I really enjoyed Principles of Mathematical Analysis. So far, I feel that Stein and Shakarchi complicate matters a lot for a text that puts off general measure theory for several chapters. As an example, I want to look at Stein and Shakarchi’s statement of Fatou’s lemma.Boundary conditions in a difference table2019-01-21T00:00:00-05:002019-01-21T00:00:00-05:00https://rwbogl.github.io/solving-a-recurrence-with-boundary-conditions-on-difference-table<p>While going over some algebra exercises with a computer, I became interested in
a particular sequence. The sequence begins $8$, $72$, $648$. (It is the number
of units in the Gaussian integers modulo $3^k$.) I started playing around with
it and noticed a curious property of its difference table: the first entry of
every row was a power of $2$. The sequence itself ended up being fairly simple
to guess—it’s just $9^k \cdot 8$—but my experiments made me wonder what
a proof would look like if we just knew that first entry. There is a fairly
well-known way to do this via the binomial transform, which I have joyfully
rediscovered.</p>
<p>To be more precise, the difference table looked like this:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{bmatrix}
8 & & 72 & & 648 \\
& 64 & & 576 & \\
& & 512 & &
\end{bmatrix}. %]]></script>
<p>It looks like the first term in every row will be $2^{3(k + 1)}$. I wanted to
find a formula for the top row, knowing this fact.</p>
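Tables like this are easy to rebuild by machine. Here is a small sketch (the helper name `difference_table` is mine) that recomputes the rows from the guessed formula $8 \cdot 9^k$ and inspects the first entry of each row:

```python
# Rebuild the difference table of a(k) = 8 * 9**k and inspect the first
# entry of each row; each should be 2**(3*(n + 1)) = 8**(n + 1).
def difference_table(seq, rows):
    """Return `rows` rows: the sequence, then successive differences."""
    table = [list(seq)]
    for _ in range(rows - 1):
        prev = table[-1]
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table

table = difference_table([8 * 9**k for k in range(6)], 4)
first_entries = [row[0] for row in table]
print(first_entries)  # [8, 64, 512, 4096]
```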
<p>More formally, suppose that we have some function $f(n, k)$ defined on
nonnegative integers by the equations
<script type="math/tex">% <![CDATA[
\begin{align}
f(n + 1, k + 1) &= f(n, k + 1) - f(n, k) &(k \geq n \geq 0) \\
f(n, n) &= g(n) &(n \geq 0) \\
f(n, k) &= 0 &(k < n),
\end{align} %]]></script>
where $g$ is some “known” function. This says that $f$ is a difference table of
the sequence $a(k) = f(0, k)$, and that $g(n)$ is the first entry of the $n$th
row. Can we determine $f(0, k)$ in terms of $g$?</p>
<h1 id="enter-generating-functions">Enter: Generating functions</h1>
<p>Given the definition of $f$, there seem to be three natural candidates for
a generating function. They are:</p>
<ul>
<li>$A_n(x) = \sum_k f(n, k) x^k$</li>
<li>$B_k(y) = \sum_n f(n, k) y^n$</li>
<li>$C(x, y) = \sum_{k, n} f(n, k) x^k y^n$</li>
</ul>
<p>If we choose $A_n(x)$, then we are looking for the coefficients of $A_0(x)$. If
we choose $B_k(y)$, then we are looking for the constant term of each $B_k(y)$.
The generating function $C(x, y)$ looks hard, so we won’t choose that.</p>
<p>For no particular reason, let’s try $A_n(x)$. The recurrence</p>
<script type="math/tex; mode=display">f(n + 1, k + 1) = f(n, k + 1) - f(n, k)</script>
<p>is valid for $k \geq n \geq 0$, so multiply by $x^k$ and sum over $k \geq n$:</p>
<script type="math/tex; mode=display">\sum_{k \geq n} f(n + 1, k + 1) x^k
= \sum_{k \geq n} f(n, k + 1) x^k - \sum_{k \geq n} f(n, k) x^k.</script>
<p>Since $f$ vanishes below the diagonal and $f(n, n) = g(n)$, if we go through
the motions here, we obtain</p>
<script type="math/tex; mode=display">\frac{A_{n + 1}(x)}{x}
= \frac{A_n(x) - g(n) x^n}{x} - A_n(x).</script>
<p>(On the left, $f(n + 1, 0) = 0$ and the terms with $k \leq n$ vanish; on the
right, the first sum misses exactly the diagonal term $g(n) x^n$ of $A_n(x)$.)
Thus, for all $n \geq 0$, we obtain</p>
<script type="math/tex; mode=display">A_{n + 1}(x) = (1 - x) A_n(x) - g(n) x^n.</script>
<p>Unrolling this equation will yield</p>
<script type="math/tex; mode=display">A_{n + 1}(x) = (1 - x)^{n + 1} A_0(x) - \sum_{m = 0}^{n} (1 - x)^{n - m} g(m) x^m</script>
<p>for all $n \geq 0$.</p>
<p>This looks promising, save for one slight problem: We don’t know what $A_0(x)$
is! That is exactly what we want to know, in fact. Discouraging though this may
be, we can take this one step further and uncover a neat result.</p>
<p>Our left hand side is the generating function $A_{n + 1}(x) = \sum_k f(n + 1,
k) x^k$. The coefficient on $x^{n + 1}$ is $f(n + 1, n + 1) = g(n + 1)$, so we
can probably link some terms on the right hand side to $g(n + 1)$ with this new
equality.</p>
<p>Indeed, let’s find the coefficient of $x^{n + 1}$ in the right-hand side of</p>
<script type="math/tex; mode=display">A_{n + 1}(x) = (1 - x)^{n + 1} A_0(x) - \sum_{m = 0}^{n} (1 - x)^{n - m} g(m) x^m.</script>
<p>Each term $(1 - x)^{n - m} g(m) x^m$ has degree $n < n + 1$, so the entire sum
contributes nothing:</p>
<script type="math/tex; mode=display">[x^{n + 1}] \left\{ (1 - x)^{n + 1} A_0(x) - \sum_{m = 0}^{n} (1 - x)^{n - m} g(m) x^m \right\}
=
[x^{n + 1}] \left\{ (1 - x)^{n + 1} A_0(x) \right\}.</script>
<p>This term requires some work:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
(1 - x)^{n + 1} A_0(x) &= \sum_{j, k} {n + 1 \choose j} (-x)^{j} f(0, k) x^k \\
&= \sum_{j, k} {n + 1 \choose j} (-1)^j f(0, k) x^{j + k}.
\end{align*} %]]></script>
<p>Therefore,</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
[x^{n + 1}] \left\{ (1 - x)^{n + 1} A_0(x) \right\} &=
[x^{n + 1}] \sum_{j, k} {n + 1 \choose j} (-1)^j f(0, k) x^{j + k} \\
&= \sum_{j, k} [j + k = n + 1] {n + 1 \choose j} (-1)^j f(0, k) \\
&= \sum_{j = 0}^{n + 1} {n + 1 \choose j} (-1)^j f(0, n + 1 - j) \\
&= \sum_{j = 0}^{n + 1} {n + 1 \choose j} (-1)^{n + 1 - j} f(0, j).
\end{align*} %]]></script>
<p>Finally, equating the coefficients from both sides gives us the equation</p>
<script type="math/tex; mode=display">g(n + 1) = \sum_{j = 0}^{n + 1} {n + 1 \choose j} (-1)^{n + 1 - j} f(0, j),</script>
<p>valid for $n \geq 0$, or</p>
<script type="math/tex; mode=display">\begin{equation}
g(n) = \sum_{j = 0}^n {n \choose j} (-1)^{n - j} f(0, j),
\end{equation}</script>
<p>valid for $n \geq 1$.</p>
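This identity is easy to sanity-check on the motivating example, where the top row is $f(0, j) = 8 \cdot 9^j$ and the diagonal is $g(n) = 2^{3(n + 1)}$. A quick script:

```python
# Verify g(n) = sum_j C(n, j) * (-1)**(n - j) * f(0, j) for the example
# top row f(0, j) = 8 * 9**j with diagonal g(n) = 2**(3*(n + 1)).
from math import comb

def top_row(j):
    return 8 * 9**j

def g(n):
    return 2**(3 * (n + 1))

for n in range(1, 10):
    rhs = sum(comb(n, j) * (-1)**(n - j) * top_row(j) for j in range(n + 1))
    assert rhs == g(n)
print("identity holds for n = 1..9")
```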
<p>This seems like an awful lot of work to get a result that is nearly the
opposite of what we want, but just you wait until the sequel!</p>
<h1 id="enter-the-binomial-transform">Enter: The binomial transform</h1>
<p>There is a wonderful theorem that runs like this:</p>
<script type="math/tex; mode=display">a(n) = \sum_{k = 0}^n {n \choose k} b(k)</script>
<p>if and only if</p>
<script type="math/tex; mode=display">b(n) = \sum_{k = 0}^n {n \choose k} (-1)^{n - k} a(k).</script>
<p>That is, the <a href="https://oeis.org/wiki/Binomial_transform"><em>binomial transform</em></a>
has a simple and unique inverse. We have proven that</p>
<script type="math/tex; mode=display">g(n) = \sum_{j = 0}^n {n \choose j} (-1)^{n - j} f(0, j),</script>
<p>for $n \geq 1$, which can be inverted to the equation</p>
<script type="math/tex; mode=display">\begin{equation}
f(0, n) = \sum_{j = 0}^n {n \choose j} g(j),
\end{equation}</script>
<p>valid for $n \geq 1$. Note that this holds for $n = 0$ as well, since $f(0, 0)
= g(0)$. Our initial problem, which seemingly had no direct way forward, has
been solved.</p>
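The inversion theorem itself invites a computational check. Here is a sketch of the transform and its inverse (the function names are mine), round-tripped on an arbitrary list:

```python
# The binomial transform and its inverse; applying one after the other
# should return the original sequence.
from math import comb

def binomial_transform(b):
    return [sum(comb(n, k) * b[k] for k in range(n + 1))
            for n in range(len(b))]

def inverse_binomial_transform(a):
    return [sum(comb(n, k) * (-1)**(n - k) * a[k] for k in range(n + 1))
            for n in range(len(a))]

b = [3, 1, 4, 1, 5, 9, 2, 6]
assert inverse_binomial_transform(binomial_transform(b)) == b
assert binomial_transform(inverse_binomial_transform(b)) == b
print("round-trip holds")
```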
<p>Let’s try it out on our original problem. I noticed that the $g$ function was
$g(n) = 2^{3(n + 1)}$. That means that, according to our results,</p>
<script type="math/tex; mode=display">\begin{equation*}
f(0, n) = \sum_{k = 0}^n {n \choose k} 2^{3(k + 1)} = 9^n \cdot 8.
\end{equation*}</script>
<p>Amazing.</p>
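(The evaluation of that sum is the binomial theorem in disguise: $\sum_k \binom{n}{k} 2^{3(k + 1)} = 8 \sum_k \binom{n}{k} 8^k = 8 (1 + 8)^n$. A short script confirms it:)

```python
# Check that sum_k C(n, k) * 2**(3*(k + 1)) equals 8 * 9**n.
from math import comb

for n in range(12):
    assert sum(comb(n, k) * 2**(3 * (k + 1)) for k in range(n + 1)) == 8 * 9**n
print("closed form confirmed for n = 0..11")
```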
<h1 id="a-step-further">A step further</h1>
<p>We “unrolled” our recurrence for $A_n(x)$ down to $0$, but we could have
stopped earlier. In fact, for $0 \leq m \leq n$, we would obtain</p>
<script type="math/tex; mode=display">A_n(x) = (1 - x)^{n - m} A_m(x) - \sum_{i = m}^{n - 1} (1 - x)^{n - 1 - i} g(i) x^i.</script>
<p>Going through the same argument as in the previous section (the sum again has
degree less than $n$, so it contributes nothing) would tell us that</p>
<script type="math/tex; mode=display">g(n) = \sum_j {n - m \choose j} (-1)^{n - m - j} f(m, m + j),</script>
<p>and then a quick binomial inversion in the variable $n - m$ yields</p>
<script type="math/tex; mode=display">f(m, m + n) = \sum_j {n \choose j} g(m + j).</script>
<p>This is valid for all $m \geq 0$ and $n \geq 0$, and reduces to our earlier
formula when $m = 0$. So every row of the table, not just the first, is
determined by the diagonal. But this seems like a good place to stop.</p>
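The relation between $g$ and the later rows can be checked against the example’s table as well. A sketch (the table-building code is mine):

```python
# Verify g(n) = sum_j C(n - m, j) * (-1)**(n - m - j) * f(m, m + j)
# against a difference table built from the example's top row.
from math import comb

K = 12
rows = [[8 * 9**k for k in range(K)]]   # row 0; rows[m][j] holds f(m, m + j)
for _ in range(K - 1):
    prev = rows[-1]
    rows.append([b - a for a, b in zip(prev, prev[1:])])

def g(n):
    return rows[n][0]                   # diagonal entry f(n, n)

for m in range(5):
    for n in range(m, 8):
        rhs = sum(comb(n - m, j) * (-1)**(n - m - j) * rows[m][j]
                  for j in range(n - m + 1))
        assert rhs == g(n)
print("diagonal relation verified")
```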
<h2 id="recap">Recap</h2>
<p>We began with a problem involving difference tables. This was easily translated
into a two-variable recurrence with some natural choices for generating
functions. The resulting generating functions gave us some interesting
equalities, which wound up being invertible binomial transforms. What are the
big takeaways here?</p>
<ol>
<li>
<p>Difference tables are amenable to attack by generating functions. In
particular, <em>the first diagonal of a difference table is the binomial
transform of the first row</em>.</p>
</li>
<li>
<p>Useless-looking equalities can sometimes be quite helpful. The binomial
transform is one of those. (For more examples, see the <a href="https://en.wikipedia.org/wiki/M%C3%B6bius_function">Möbius
function</a> in number
theory, which is famous for inverting sums over integer divisors.)</p>
</li>
<li>
<p>Sequences are uniquely determined by the first diagonal of their difference
tables.</p>
</li>
</ol>Robert Dougherty-Blissrobert.w.bliss@gmail.comWhile going over some algebra exercises with a computer, I became interested in a particular sequence. The sequence begins $8$, $72$, $648$. (It is the number of units in the Gaussian integers modulo $3^k$.) I started playing around with it and noticed a curious property of its difference table: the first entry of every row was a power of $2$. The sequence itself ended up being fairly simple to guess—it’s just $9^k \cdot 8$—but my experiments made me wonder what a proof would look like if we just knew that first entry. There is a fairly well-known way to do this via the binomial transform, which I have joyfully rediscovered.