authorAndré Nusser <andre.nusser@googlemail.com>2020-02-02 17:08:31 +0100
committerAndré Nusser <andre.nusser@googlemail.com>2020-02-02 17:08:31 +0100
commit0e39ed42389067ac6e2c1ce91ceafd1731fac460 (patch)
treeaa1a06c60f5faae9a375e8af8a78acfaea33bfbf
parentf11258049b9f3676e4d11c498e590d983f488e3e (diff)
Continued.
-rw-r--r--sampling_alg_lac2020/LAC-20.tex33
1 files changed, 24 insertions, 9 deletions
diff --git a/sampling_alg_lac2020/LAC-20.tex b/sampling_alg_lac2020/LAC-20.tex
index c8d0230..1042e66 100644
--- a/sampling_alg_lac2020/LAC-20.tex
+++ b/sampling_alg_lac2020/LAC-20.tex
@@ -306,7 +306,7 @@ More concretely, we aim to fulfill the following requirements with our proposed
\item[Locality:] If two samples have a very similar power value, they should also be treated similarly by the algorithm. In other words, locally, samples should have almost the same probability of being chosen.
\end{description}
-We now formalize the requirements stated above. Let $p, p'$ be two power levels. We define their dissimilarity to simply be their distance $\abs{p - p'}$. Thus, if $p$ is the input power value and $p'$ is the power value of the chosen sample, we want to minimize the above term. Let $s$ be a sample and $t_s$ the time point it was played last. When we are now queried for a sample at time $t$, then for $s$ to be a good sample, we want $t - t_s$ to be reasonably high. Again, we just use the distance between the current time step and the last time step a sample was used. Randomization is difficult to formalize in a simple way in this context, thus, we simply require that for the same history, different outcomes of choosing a sample should be possible. The last requirement we also state in a more intuitive than formal way. Assume we are requested a sample for the power value $p$ and the two samples $s, s'$ have a very similar power value. Then, if we exchange $t_s$ and $t_{s'}$, the probability of choosing $s$ over $s'$ should be roughly the same as if we don't exchange them.
+We now formalize the requirements stated above. Let $p, p'$ be two power levels. We define their dissimilarity simply as their distance $\abs{p - p'}$. Thus, if $p$ is the input power value and $p'$ is the power value of the chosen sample, we want to minimize this term. Let $s$ be a sample and $t_s$ the time point at which it was last played. When we are queried for a sample at time $t$, then for $s$ to be a good choice, we want $t - t_s$ to be reasonably large. Again, we simply use the distance between the current time step and the last time step at which the sample was used. Randomization is difficult to formalize in a simple way in this context; thus, we merely require that for the same history, different outcomes of choosing a sample are possible. We state the last requirement in an intuitive rather than formal manner. Assume we request a sample for the power value $p$ and the two samples $s, s'$ have very similar power values. Then, if we exchange $t_s$ and $t_{s'}$, the probability of choosing $s$ over $s'$ should be roughly the same as if we do not exchange them.
\section{Algorithm}
Before we describe the approach we decided on, let us first discuss other approaches that might seem promising at this point of the discussion. \todoandre{Also quickly discuss other alternative models that are following naturally from above observations; but wait until Bent wrote the related work part to make sure to be consistent with that.}
@@ -325,21 +325,21 @@ We now give the objective function. Let $p$ be the power that is requested. As b
Note that we have to ensure $t_s \neq t$ to avoid division by zero.
% \todoandre{Talk about the single terms of the potential function.}
-Let us now consider the objective function in more detail. There are three terms in the sum and we will have a closer look at them in order.
+Let us now consider the objective function in more detail. The objective function consists of three summands and we will have a closer look at them in order.
-The first term, namely
+The first summand, namely
\[
\alpha \cdot \left( \frac{p-p_s}{p_{\max} - p_{\min}}\right)^2,
\]
expresses the distance of the sample's power value $p_s$ from the desired power value $p$. Instead of using the absolute value $\abs{p-p_s}$ as discussed in Section \ref{sec:requirements}, we first normalize the value to lie in the range $[0,1]$ and then square it. By squaring, we penalize samples whose power value is very far from the requested one significantly more strongly than the absolute value would. Note that if we request a power value that lies in the middle of the power values of our sample set, this term is at most around $\frac{1}{4}\alpha$. However, if $p = p_{\max}$ or $p = p_{\min}$, we might obtain a value of $\alpha$. While this might seem unreasonable at first glance, it indeed matches the desired behavior, because the penalty should depend only on the distance to the sample and not on the worst-case value it could attain. In other words, a bad choice should not be made to look better just because much worse choices are possible.
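The bound discussed above can be verified with a quick numeric check (a sketch with illustrative values; `power_term` is a hypothetical helper, and the power range is normalized to $[0,1]$ for simplicity):

```python
alpha = 1.0
p_min, p_max = 0.0, 1.0

def power_term(p, p_s):
    # First summand: normalized, squared power distance.
    return alpha * ((p - p_s) / (p_max - p_min)) ** 2

# Requesting the midpoint: no sample can be farther than half the
# range away, so the term is at most alpha / 4.
assert power_term(0.5, p_max) == 0.25

# Requesting an extreme: a sample at the opposite extreme yields
# the full penalty alpha.
assert power_term(p_max, p_min) == 1.0
```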
-The second term, namely
+The second summand, namely
\[
\beta \cdot \left( 1 + \frac{t - t_s}{S}\right)^{-1},
\]
-expresses how much time passed since we last played the sample for which we are currently evaluating the objective function. However, we want large values if little time passes and therefore we raise the whole term to the power of $-1$. To avoid extreme values, we add 1 to the normalized distance of the current time and the last time the sample was played. Note that if we wouldn't have a \enquote{+1}, then for $t - t_s$ being very small, the values of this term would be huge and thus dominate the whole objective function. We normalize by the sample rate, as we want the time distance to be in seconds and not in samples.
+expresses how much time has passed since we last played the sample for which we are currently evaluating the objective function. Since we want large values when little time has passed, we raise the whole term to the power of $-1$. To avoid extreme values, we add 1 to the normalized distance between the current time and the last time the sample was played. Note that without the \enquote{+1}, the value of this term would be huge for very small $t - t_s$ and would thus dominate the whole objective function. We normalize by the sample rate, as we want the time distance in seconds rather than in samples.
-The third therm, namely
+The third summand, namely
\[
\gamma \cdot r(s,t),
\]
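Putting the three summands together, the evaluation of the objective can be sketched as follows (a minimal sketch: the function name, the default weights, and modeling the randomization term $r(s,t)$ as a uniform draw from $[0,1)$ are assumptions for illustration, not DrumGizmo's actual values):

```python
import random

def objective(p, p_s, t, t_s, p_min, p_max, sample_rate,
              alpha=1.0, beta=1.0, gamma=1.0, rng=random.random):
    """Evaluate f(s, t) for one sample (illustrative sketch)."""
    # First summand: normalized, squared power distance.
    power_term = alpha * ((p - p_s) / (p_max - p_min)) ** 2
    # Second summand: recency penalty. The "+1" keeps the term
    # bounded when t - t_s is tiny; dividing by the sample rate
    # converts the distance from samples to seconds.
    recency_term = beta / (1.0 + (t - t_s) / sample_rate)
    # Third summand: randomization term (assumed uniform here).
    random_term = gamma * rng()
    return power_term + recency_term + random_term
```

With `gamma = 0` the function is deterministic, which is convenient for checking the first two summands in isolation.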
@@ -376,17 +376,32 @@ Note that the worst-case complexity of evaluating the objective function is line
\todobent{Give a short introduction to DrumGizmo, including a link to the git repository.}
\todo{Talk about the timeline, i.e., when were the releases and what is still unreleased?}
-\todoandre{Talk about how the sampling algorithm was implemented. What do we need to store?}
+% \todoandre{Talk about how the sampling algorithm was implemented. What do we need to store?}
+The sampling algorithm itself did not require any particular implementation effort; most of the time was spent on its theoretical development. To give a better overview of the technicalities, we briefly list the information that needs to be stored. In a preprocessing phase, we compute $p_{\min}$ and $p_{\max}$ for each instrument and set all values of the $\mathit{last}$ arrays to 0. The power values of the samples are given by the drum kit in DrumGizmo. The parameters $\alpha, \beta, \gamma$ have default values that were determined experimentally. Each of them can be changed by the user, either interactively in the GUI or via the command line interface.
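The per-instrument state described above can be sketched as follows (hypothetical class and field names; a sketch of the stored information, not DrumGizmo's actual data structures):

```python
class InstrumentState:
    """Per-instrument state set up in the preprocessing phase."""

    def __init__(self, powers):
        # Power values come from the drum kit; kept sorted so that
        # a search can expand outward from the closest power value.
        self.powers = sorted(powers)
        self.p_min = self.powers[0]
        self.p_max = self.powers[-1]
        # last[i]: time the i-th sample was last played, initially 0.
        self.last = [0] * len(self.powers)
```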
+% \todoandre{Add some of the source code to the paper?}
+As DrumGizmo is free software, the exact details of the implementation can be inspected by anyone.
-\todoandre{Add some of the source code to the paper?}
-\todoandre{Give less important implementation details, e.g., like adaptive search starting from the most promising value}
+% \todoandre{Give less important implementation details, e.g., like adaptive search starting from the most promising value}
+For instruments with reasonably small sample sets, simply iterating over all samples of the specific instrument as shown in Algorithm \ref{alg:sampling} is sufficiently fast. However, imagine an instrument with an extremely large sample set. As DrumGizmo drum kits can be created by anyone, there is no restriction on their size, and we cannot assume a small number of samples. To avoid performance issues in such a scenario, we employ a non-naive search that starts with the \enquote{most promising} sample and then inspects its neighbors until the currently best sample is known to dominate all remaining samples. More concretely, we do the following: we start with the sample $s$ whose power value $p_s$ is closest to the requested power value $p$, i.e., the $s$ that minimizes $\abs{p - p_s}$. The key observation why a local search often suffices is that the second and third summands of the objective function are bounded from below by 0. Thus, for a given sample $s$ and a time point $t$, we have
+\begin{equation}\label{eq:lower_bound}
+ f(s,t) \geq \alpha \cdot \left( \frac{p-p_s}{p_{\max} - p_{\min}}\right)^2.
+\end{equation}
+Assume now that, for some $x > 0$, we evaluated all samples $s$ with $p-x \leq p_s \leq p+x$. Then we know, by Equation \ref{eq:lower_bound}, that for all samples $s'$ with power values outside the range $[p-x, p+x]$ it holds that
+\[
+ f(s',t) > \alpha \cdot \left( \frac{x}{p_{\max} - p_{\min}}\right)^2.
+\]
+Note that by traversing the samples in order of their distance to $p$ (which is possible because the samples are stored in an array sorted by power value), we can stop the search as soon as we have searched a range $[p-x, p+x]$ for which the best objective value found so far, $f_{\min}$, satisfies
+\[
+ f_{\min} < \alpha \cdot \left( \frac{x}{p_{\max} - p_{\min}}\right)^2.
+\]
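The search described above can be sketched as follows (an illustrative sketch, not DrumGizmo's implementation: the function name, parameter defaults, and the uniform randomization term are assumptions; the stopping criterion is the one derived above):

```python
import bisect
import random

def find_best(powers, last, p, t, sample_rate,
              alpha=1.0, beta=1.0, gamma=1.0):
    """Local search over samples sorted increasingly by power value.

    Expands outward from the sample closest to p and stops once the
    best objective value found beats the alpha lower bound of every
    unvisited sample.
    """
    p_min, p_max = powers[0], powers[-1]
    span = p_max - p_min

    def f(i):
        power = alpha * ((p - powers[i]) / span) ** 2
        recency = beta / (1.0 + (t - last[i]) / sample_rate)
        return power + recency + gamma * random.random()

    # Two cursors walking away from the insertion point of p,
    # so samples are visited in order of distance to p.
    right = bisect.bisect_left(powers, p)
    left = right - 1
    best_i, best_f = -1, float("inf")
    while left >= 0 or right < len(powers):
        # Advance on the side whose power value is closer to p.
        if right >= len(powers) or (
                left >= 0 and p - powers[left] <= powers[right] - p):
            i, left = left, left - 1
        else:
            i, right = right, right + 1
        x = abs(p - powers[i])
        # Every remaining sample is at least x away from p, so its
        # objective exceeds alpha * (x / span)^2; stop if the best
        # value found already beats that bound.
        if best_f < alpha * (x / span) ** 2:
            break
        fi = f(i)
        if fi < best_f:
            best_i, best_f = i, fi
    return best_i
```

Setting `gamma = 0` makes the search deterministic, which is useful when checking that the early-stopping bound never discards the optimum.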
\section{Experiments}
\todoandre{Talk about the setup.}
\todoandre{Talk about what the experiments should show: two close samples are chosen similarly often; playing the same MIDI note plays a reasonably varied sample set; average distance of one sample}
\todoandre{Experiments are: playing fast sweeps (with multiple hits per velocity); playing a single note over and over again at the same velocity; sound examples that people can listen to online?}
\todoandre{Do beautiful tables and plots here}
+\todoandre{Also do an experiment regarding the adaptive search}
\todoandre{Summarize experiments}
\section{Conclusion and Future Work}