author: André Nusser <andre.nusser@googlemail.com> 2020-02-10 11:44:55 +0100
committer: André Nusser <andre.nusser@googlemail.com> 2020-02-10 11:44:55 +0100
commit 4c653f1fc5102bf5b86fa72d225371a257992597 (patch)
tree f0cca959dcfbdf0d7383242be8d109b7e6a89e82
parent 7933b6b20f293d561e185c03a28fa347761151c4 (diff)
Finish everything except introduction.
-rw-r--r-- sampling_alg_lac2020/LAC-20.tex | 40
1 file changed, 19 insertions(+), 21 deletions(-)
diff --git a/sampling_alg_lac2020/LAC-20.tex b/sampling_alg_lac2020/LAC-20.tex
index e1f3174..7e8846b 100644
--- a/sampling_alg_lac2020/LAC-20.tex
+++ b/sampling_alg_lac2020/LAC-20.tex
@@ -455,10 +455,10 @@ More concretely, we aim to fulfill the following requirements with our proposed
We now formalize the requirements stated above. Let $p, p'$ be two power levels. We define their dissimilarity to simply be their distance $\abs{p - p'}$. Thus, if $p$ is the input power value and $p'$ is the power value of the chosen sample, we want to minimize this term. Let $s$ be a sample and $t_s$ the time point at which it was last played. When we are queried for a sample at time $t$, then for $s$ to be a good sample, we want $t - t_s$ to be reasonably large. Again, we simply use the distance between the current time step and the last time step at which the sample was used. Randomization is difficult to formalize in a simple way in this context; thus, we simply require that for the same history, different outcomes of choosing a sample should be possible. The last requirement we state in an intuitive rather than formal way. Assume a sample is requested for the power value $p$ and the two samples $s, s'$ have very similar power values. Then, if we exchange $t_s$ and $t_{s'}$, the probability of choosing $s$ over $s'$ should be roughly the same as if we had not exchanged them.
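The symmetry requirement in the last sentence can be written semi-formally as follows (our notation; the text above states it only intuitively): for samples $s, s'$ with $p_s \approx p_{s'}$,

```latex
% Semi-formal version of the symmetry requirement (illustrative notation).
% Swapping the last-played times t_s and t_{s'} of two samples with very
% similar power values should (approximately) swap their selection
% probabilities.
\[
  \Pr\bigl[\text{choose } s \mid t_s = a,\ t_{s'} = b\bigr]
  \;\approx\;
  \Pr\bigl[\text{choose } s' \mid t_s = b,\ t_{s'} = a\bigr].
\]
```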
\section{Algorithm}
-Before we start describing the approach that we decided for, let us first discuss other approaches that might seem promising at this point of the discussion. \todoandre{Also quickly discuss other alternative models that are following naturally from above observations; but wait until Bent wrote the related work part to make sure to be consistent with that.}
+% Before we start describing the approach that we decided for, let us first discuss other approaches that might seem promising at this point of the discussion. \todoandre{Also quickly discuss other alternative models that are following naturally from above observations; but wait until Bent wrote the related work part to make sure to be consistent with that.}
% \todoandre{We have a multi-criteria optimization!}
-The requirements mentioned in Section \ref{sec:requirements} consist of several different objectives. Thus, we are dealing with a multi-objective optimization, where we have to somehow combine the different objectives into one. As we are dealing with an interactive setting where the quality of the solution of the optimization is determined by the user, it seems natural to make the algorithm parametrized and expose the parameters to the user. Using these parameters, the user can influence how the different objectives are weighted and thereby influence the final result, just having a very rough intuitive and non-technical understanding of the process.
+In this section we discuss the new algorithm that we propose for sample selection. The requirements mentioned in Section \ref{sec:requirements} consist of several different objectives. Thus, we are dealing with a multi-objective optimization, in which the different objectives have to be combined into one. As we are dealing with an interactive setting where the quality of the solution is judged by the user, it seems natural to parametrize the algorithm and expose the parameters to the user. Using these parameters, the user can influence how the different objectives are weighted and thereby influence the final result, while needing only a rough, intuitive, non-technical understanding of the process.
% \todoandre{Talk about the general idea of a potential function.}
Following the above description, we choose a single objective function to optimize. This objective function consists of the terms described in Section \ref{sec:requirements}, each weighted by one of the exposed parameters. We formulate the objective such that smaller values are better and thus end up with a minimization problem. Consequently, we just have to evaluate a single term for each sample and then pick the sample with the best value.
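As an illustration of this single-term evaluation, the following sketch (ours, not DrumGizmo's code; the exact term shapes and normalizations used in the paper may differ) combines a power-distance term, a recency penalty, and a random term, weighted by $\alpha, \beta, \gamma$, and picks the minimizing sample:

```python
import random

def select_sample(powers, p, t, last, alpha, beta, gamma):
    """Return the index of the sample minimizing the combined objective.

    powers: list of per-sample power values p_s
    p:      requested power value
    t:      current time step (assumed > last[i] for all i)
    last:   last[i] = time step at which sample i was last played (0 if never)
    """
    best_i, best_val = None, float("inf")
    for i, p_s in enumerate(powers):
        closeness = abs(p - p_s)           # requirement: similar power value
        age_penalty = 1.0 / (t - last[i])  # requirement: avoid recently played samples
        noise = random.random()            # requirement: randomization
        val = alpha * closeness + beta * age_penalty + gamma * noise
        if val < best_val:
            best_i, best_val = i, val
    return best_i
```

With $\beta = \gamma = 0$ this degenerates to always picking the sample closest in power, matching the extreme parameter choices discussed in the paper.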
@@ -528,19 +528,12 @@ First, note that all extreme choices of the parameters -- meaning that we set on
\paragraph{Random Selection.} If now $\gamma > 0$ and $\alpha = \beta = 0$, then the objective function reduces to the third summand and we thus always select a sample uniformly at random.
-\paragraph{Round Robin.} The previously mentioned emulations were straight forward, however, the arguably most commonly used sample selection algorithm in practice is round robin. As already discussed in Section \ref{sec:introduction}, round robin assumes the samples to already be grouped. In our case this means that samples $s_1, \dots,s_k$ that belong to the same velocity group should all have the same power value, i.e., $p_{s_1} = \cdots = p_{s_k}$. Now if there is a query with power value $p$, we want to always choose the closest group of samples, thus $\alpha$ should be huge. After restricting to the samples of a specific group, we now always want to play the oldest sample, thus we simply want $\beta > 0$. If we additionally want to randomize round robin in a way that we sometimes choose the second or third oldest sample, then we want to set $\gamma$ to a small value greater than zero. \todoandre{improve last few sentences.}
+\paragraph{Round Robin.} The previously mentioned emulations were straightforward; however, the arguably most commonly used sample selection algorithm in practice is round robin. As already discussed in Section \ref{sec:introduction}, round robin assumes the samples to already be grouped. In our case this means that samples $s_1, \dots, s_k$ that belong to the same group should all be assigned the same power value, i.e., $p_{s_1} = \cdots = p_{s_k}$. Now, if there is a query with power value $p$, we always want to choose the closest group of samples, thus $\alpha$ should be large compared to $\beta$ and $\gamma$, e.g., $\alpha = 1$. After restricting to the samples of a specific group, we always want to play the oldest sample, thus we simply want a small value $\beta > 0$, say $\beta = 1/1000$. If we additionally want to randomize round robin such that we sometimes choose the second or third oldest sample, we set $\gamma$ to a value slightly smaller than $\beta$, say $\gamma = 1/4000$.
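To illustrate, here is a self-contained check (our sketch, not DrumGizmo code) that parameter choices of this kind indeed emulate round robin; $\gamma$ is set to $0$ so the check is deterministic, and the objective form is the illustrative one used throughout, with the oldest sample of the closest power group always winning:

```python
def pick(powers, last, p, t, alpha=1.0, beta=1e-3, gamma=0.0):
    # Illustrative objective: alpha*|p - p_s| + beta/(t - t_s); smaller is
    # better. With gamma = 0 the choice is deterministic, and a large alpha
    # relative to beta first selects the closest power group, then the
    # small beta term favors the sample played longest ago within it.
    return min(range(len(powers)),
               key=lambda i: alpha * abs(p - powers[i]) + beta / (t - last[i]))

powers = [0.5, 0.5, 0.5, 0.9]  # one "velocity group" at 0.5, one sample at 0.9
last = [3, 1, 2, 0]            # sample 1 is the oldest within the 0.5 group
assert pick(powers, last, p=0.5, t=4) == 1
```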
\section{Implementation} \label{sec:implementation}
%\todobent{Give a short introduction to DrumGizmo, including a link to the git repository.}
-DrumGizmo is an open source, cross-platform, drum sample engine and
-audio plugin aiming to give the best possible sound from the least
-effort when doing drum programming.
-The source-code is available though git at:\\
-\verb|git://git.drumgizmo.org/drumgizmo.git|\\
-and the source code can be browsed online here:\\
-\verb|http://cgit.drumgizmo.org/drumgizmo.git/|
-
-\todo{Talk about the timeline, i.e., when were the releases and what is still unreleased?}
+We added our new sampling algorithm to DrumGizmo, replacing the one it previously used. DrumGizmo is an open source, cross-platform drum sample engine and
+audio plugin aiming to produce output that is as close to a real drummer as possible.\footnote{The source code is available through git at \url{git://git.drumgizmo.org/drumgizmo.git} and can be browsed online at \url{http://cgit.drumgizmo.org/drumgizmo.git/}.}
% \todoandre{Talk about how the sampling algorithm was implemented. What do we need to store?}
The sampling algorithm itself did not require any particular implementation effort; most of the time was spent on its theoretical development. To give a better overview of the technicalities, we briefly list the information that needs to be stored. In a preprocessing phase, we compute $p_{\min}$ and $p_{\max}$ for each instrument and set all values of the $\mathit{last}$ arrays to 0. The power values of the samples are given by the drum kit in DrumGizmo. The parameters $\alpha, \beta, \gamma$ have default values that were determined experimentally. Each of them can be changed by the user, either interactively in the GUI or via the command line interface.
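The stored information listed above can be summarized in a small sketch (the class and field names are illustrative, not DrumGizmo identifiers):

```python
class InstrumentState:
    """Per-instrument state as described in the text (illustrative names)."""

    def __init__(self, sample_powers, alpha, beta, gamma):
        # Power values come from the drum kit; kept sorted so samples can
        # be addressed by their index in the array of ordered samples.
        self.powers = sorted(sample_powers)
        # p_min and p_max are computed once in the preprocessing phase.
        self.p_min = min(sample_powers)
        self.p_max = max(sample_powers)
        # last[i] = time step at which sample i was last played; all 0 initially.
        self.last = [0] * len(sample_powers)
        # User-tunable weights with experimentally determined defaults.
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
```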
@@ -570,7 +563,7 @@ We want to evaluate how the sample selection algorithm performs in practice, the
We do experiments on three different drum kits.
As normally the snare drum is the instrument with the highest number of samples, we choose this drum for our experiments.
% See Table \ref{tab:drumkit_data} for some information about the samples of the drum kits and see Figure \ref{fig:power_level_distribution} for a visualization of the power level distribution of the kits.
-For our experiments, we use the Crocell kit, which has 98 snare samples. See Figure \ref{fig:power_level_distribution} for a visualization of the power level distribution of the Crocell kit.
+For our experiments, we use the Crocell kit, which has 98 snare samples. See Figure \ref{fig:power_level_distribution} for a visualization of the power level distribution of the Crocell kit. In this plot, we can see that the sampling is dense for the quieter samples and significantly sparser for the louder ones. We ask the reader to keep this distribution in mind for the remainder of this section. Due to this distribution, we expect that when playing notes across the whole power spectrum, each individual loud sample will be played more often, as there are significantly fewer of them to choose from.
% \begin{table}
% \caption{Drum kit data.}
@@ -610,7 +603,7 @@ To test the above hypotheses, we conduct the following experiments:
\end{enumerate}
% \todoandre{Do beautiful tables and plots here}
-To be able to compare the results of those experiments, we have to somehow visualize the selections. We choose to plot histograms. More concretely, for a run of the experiment, we plot the number of times a sample was selected over its index in the array of ordered samples. For the first experiment, i.e., the sweeps, you can see the histograms in Figure \ref{fig:sweep}. \todo{describe and explain plot}
+To be able to compare the results of those experiments, we have to visualize the selections in some way. We choose to plot histograms. More concretely, for a run of the experiment, we plot the number of times a sample was selected over its index in the array of ordered samples. For the first experiment, i.e., the sweeps, the histograms are shown in Figure \ref{fig:sweep}. The plot of the old sampling algorithm is clearly less homogeneous and has larger variance than that of the new sampling algorithm. In particular, the old sampling algorithm uses several samples very often while barely using others, which can lead to a more robotic sound. Moreover, it seems a waste of storage and sampling effort not to use the full potential of the data set.
\begin{figure*}
\includegraphics[width=.47\linewidth]{figures/sweep_old.pdf}
@@ -620,7 +613,7 @@ To be able to compare the results of those experiments, we have to somehow visua
\label{fig:sweep}
\end{figure*}
-The second experiment we conducted for two different MIDI velocities: MIDI velocity 80 (Figure \ref{fig:80}) and MIDI velocity 112 (Figure \ref{fig:112}). \todo{describe and explain plot}
+We conducted the second experiment for two different MIDI velocities: MIDI velocity 80 (Figure \ref{fig:80}) and MIDI velocity 112 (Figure \ref{fig:112}). Let us first discuss Figure \ref{fig:80}. The most significant shortcoming of the old algorithm is that the histogram does \emph{not} roughly resemble a bell curve. Instead, some samples that are barely used lie very close to samples that are used predominantly. In particular, the property that samples with similar power values should be used similarly often is violated. The new algorithm clearly improves on these shortcomings and gives us the desired result. In Figure \ref{fig:112}, we see a situation where only a few samples are available. Here, the old and the new algorithm perform similarly, except that the new algorithm again distributes the usage of the samples more evenly. More importantly, even though the new algorithm does not explicitly restrict itself to a specific power value range, it still does not choose samples that are far away from the desired power value.
\begin{figure*}
\includegraphics[width=.47\linewidth]{figures/80_old.pdf}
@@ -640,7 +633,8 @@ The second experiment we conducted for two different MIDI velocities: MIDI veloc
% \todoandre{Also do an experiment regarding the adaptive search}
To also get an idea of the performance of the new sampling algorithm, we want to see how many power values of samples are evaluated per query.
-Without the smart search optimization described at the end of Section \ref{sec:implementation}, this number would always be the number of samples. However, we expect that the smart search optimization significantly reduces the number of evaluations. To test this hypothesis, we take the above experiment and look at the number of evaluations. You can see some data regarding the number of evaluations in Table \ref{tab:evaluation_count}.
+Without the smart search optimization described at the end of Section \ref{sec:implementation}, this number would always be the number of samples. However, we expect that the smart search optimization significantly reduces the number of evaluations. To test this hypothesis, we take the above experiment and look at the number of evaluations.
+Recall that the Crocell kit contains 98 snare samples, so naively evaluating the objective function would lead to 98 evaluations per query. The outcome of the experiment is shown in Table \ref{tab:evaluation_count}. The mean number of evaluations (at most 14 in all experiments) is significantly lower than the worst case of 98 evaluations per sample selection, and the variance is also very low, meaning that even for large sample sets, this sample selection algorithm should work well in real-time scenarios.
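One plausible realization of such a pruned search is sketched below (our sketch; the paper's actual smart search is defined at the end of Section \ref{sec:implementation} and may differ): with the samples sorted by power value, expand outward from the sample closest to the requested power and stop as soon as the $\alpha$-term alone exceeds the best objective value seen so far, which is a valid bound because the remaining terms are non-negative.

```python
import bisect
import random

def select_pruned(powers, last, p, t, alpha, beta, gamma):
    """Pruned variant of the objective minimization (illustrative sketch).

    powers must be sorted ascending; returns (best index, #evaluations).
    Assumes t > last[i] for all i.
    """
    n = len(powers)
    start = bisect.bisect_left(powers, p)
    lo, hi = start - 1, start
    best_i, best_val, evals = None, float("inf"), 0
    while lo >= 0 or hi < n:
        # Visit the yet-unseen sample whose power is closest to p.
        if hi >= n or (lo >= 0 and p - powers[lo] <= powers[hi] - p):
            i, lo = lo, lo - 1
        else:
            i, hi = hi, hi + 1
        # All remaining samples are at least this far from p, so their
        # objective is at least alpha*|p - p_s|: safe to stop here.
        if alpha * abs(p - powers[i]) >= best_val:
            break
        evals += 1
        val = (alpha * abs(p - powers[i])
               + beta / (t - last[i])
               + gamma * random.random())
        if val < best_val:
            best_i, best_val = i, val
    return best_i, evals
```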
% \begin{figure}
% % \includegraphics[width=.8\linewidth]{figures/evaluations_histogram.pdf}
@@ -649,7 +643,7 @@ Without the smart search optimization described at the end of Section \ref{sec:i
% \end{figure}
\begin{table}
-\caption{Number of evaluations per query.}
+\caption{Number of evaluations per query. The columns labeled 16, 48, 80, and 112 correspond to the experiments above in which a single MIDI note is played repeatedly; the label gives the MIDI velocity of that note.}
\label{tab:evaluation_count}
\centering
\begin{tabular}{|l||rrrrr|}
@@ -662,12 +656,16 @@ stddev evaluations & 2.47 & 0.19 & 1.53 & 0.50 & 0.00 \\
\end{tabular}
\end{table}
-\todoandre{Summarize experiments}
+In summary, the experiments show that the new sampling algorithm is clearly superior to the old method that was employed in DrumGizmo and fulfills all the requirements formulated in Section \ref{sec:requirements}.
\section{Conclusion and Future Work}
-\todoandre{Recapitulate what was done in this paper. Highlight some of the difficulties and surprises.}
-\todoandre{List future work: transforming the loudness space; refine the objective function; adapt algorithm to other instruments/settings; study to see what sounds good to people and do they actually hear the difference?}
-\todoandre{Change diversity parameter. Add a bound on the deviation from the requested power value to definitely avoid extrem outliers.}
+% \todoandre{Recapitulate what was done in this paper. Highlight some of the difficulties and surprises.}
+This article presented a new algorithm for choosing samples in a setting where the samples are annotated with power values, with the desired behavior of choosing samples close to the input power value while maintaining reasonable diversity and avoiding significant patterns in the sequence of selected samples. We first formulated the requirements, which led us to an algorithm that we added to DrumGizmo. Through experiments, we showed clear improvements over the old method as well as the fulfillment of the desired requirements.
+
+However, there are still some open problems and directions to be addressed in future work. First, in this article we assumed the power values of the samples to be given. We could alter the setting by allowing a transformation of the power values, thus, for example, distributing them better over the power range. Second, the objective function that we used could be refined further. For example, the diversity term (controlled by the parameter $\beta$) could be modified to penalize samples that were played very recently relatively more strongly, e.g., by moving away from purely linear terms. Third, taking a more general view, one could try to adapt our work to instruments that differ significantly from drums. Finally, the real quality of this work should be judged by the users themselves; in this direction, a user study could yield insights into how to refine our approach further.
+
+% \todoandre{List future work: transforming the loudness space; refine the objective function; adapt algorithm to other instruments/settings; study to see what sounds good to people and do they actually hear the difference?}
+% \todoandre{Change diversity parameter. Add a bound on the deviation from the requested power value to definitely avoid extrem outliers.}
% \section{Acknowledgements}
% \todo{Thank people for testing?}