author     André Nusser <andre.nusser@googlemail.com>  2020-02-01 23:41:11 +0100
committer  André Nusser <andre.nusser@googlemail.com>  2020-02-01 23:41:11 +0100
commit     108435bab43c109d3bfc9ffd9432a66706a18345 (patch)
tree       48d777d4dfa6c2773939e34cec4ed1de21aa4bc6
parent     89adce51567e5875eb285f4ded1638d3aba2b58f (diff)
Continue resolving TODOs.
-rw-r--r--  sampling_alg_lac2020/LAC-20.tex  34
1 file changed, 29 insertions, 5 deletions
diff --git a/sampling_alg_lac2020/LAC-20.tex b/sampling_alg_lac2020/LAC-20.tex
index 285323e..f509237 100644
--- a/sampling_alg_lac2020/LAC-20.tex
+++ b/sampling_alg_lac2020/LAC-20.tex
@@ -152,6 +152,8 @@
\usepackage{csquotes}
% Comment in the next line to draw frames around all the layout boxes for checking where they are violated.
%\usepackage{showframe}
+\usepackage{algorithm}
+\usepackage{algpseudocode}
% nice theorem and proof environments. taken from Ema's template
\theoremstyle{plain}
@@ -276,7 +278,7 @@
\todobent{Discuss DGs old sampling algorithm briefly.}
\subsection{Our Contribution}
-\todoandre{The main points are: Identify important aspects that sampling algorithms in this setting have to fulfill; Suggest a new algorithm based on those requirements; Implement and conduct experiments on this implementation.}
+To the best of our knowledge, this is the first academic article that deals with selecting samples from a set of samples with \enquote{continuous power values}. We first identify important aspects that a sampling algorithm in this setting has to fulfill. After formulating these requirements and formalizing them to some degree, we present an algorithm based on them, which evaluates a multi-criteria objective function. Subsequently, we give an overview of an implementation of this approach and conduct experiments to evaluate its quality. As a reference, we use the previous sampling method of DrumGizmo -- an open source drum machine.
\section{Preliminaries}
\todobent{Talk about how the drum kit samples are usually created; very briefly.}
@@ -288,7 +290,7 @@
\subsection{Notation and Terminology}
We use the following notation throughout this article. An \emph{instrument} is one of the drums of the drum kit that we sampled. A \emph{sample} (denoted by $s, s', \dots$) is a recording of one hit on a specific instrument. The \emph{power} of a sample (denoted by $p, p', \dots$) is its perceived loudness and can be expressed by any common loudness measure of an audio clip. With the term \emph{velocity} (denoted by $v, v', \dots$), we refer to the attack velocity of a MIDI note, which is thus between 0 and 127. We consider time in a discretized way; a \emph{time point} is therefore an integer value, intuitively referring to the number of time steps passed since the beginning of time. For a sample $s$, we write $t_s$ for the time point at which the sample was played last.
-\section{Requirements}
+\section{Requirements} \label{sec:requirements}
% \todoandre{Intuitively discuss the requirements of a good sampling algorithm.}
We now discuss which requirements a good sampling algorithm intuitively has to fulfill. Such an algorithm has to trade off two main objectives: choosing a sample whose power is close to the requested power value, while not choosing the same sample again too soon after it was last used. Note that if we only wanted to be as close as possible to the requested power value, we would always choose the closest sample. However, if we then play a sequence of notes on the same instrument at the same power level, we always play the same sample and thereby obtain a robotic sound. Thus, we also want to use other samples whose power is not too far from the requested value.
@@ -305,16 +307,38 @@ More concretely, we aim to fulfill the following requirements with our proposed
We now formalize the requirements stated above. Let $p, p'$ be two power levels. We define their dissimilarity to simply be their distance $\abs{p - p'}$. Thus, if $p$ is the requested power value and $p'$ is the power value of the chosen sample, we want to minimize this term. Let $s$ be a sample and $t_s$ the time point at which it was played last. When we are queried for a sample at time $t'$, then for $s$ to be a good choice, we want $\abs{t_s - t'}$ to be reasonably large; again, we simply use the distance between the current time step and the last time step at which the sample was used. Randomization is difficult to formalize in a simple way in this context; thus, we only require that for the same history, different outcomes of the sample choice are possible. The last requirement we also state in a more intuitive than formal way: assume a sample is requested for the power value $p$ and the two samples $s, s'$ have very similar power values. Then, if we exchange $t_s$ and $t_{s'}$, the probability of choosing $s$ over $s'$ should be roughly the same as if we do not exchange them.
\section{Algorithm}
-\todoandre{We have a multi-criteria optimization!}
-\todoandre{Talk about the general idea of a potential function.}
+Before describing the approach we decided on, let us first discuss other approaches that might seem promising at this point. \todoandre{Also quickly discuss other alternative models that are following naturally from above observations; but wait until Bent wrote the related work part to make sure to be consistent with that.}
+
+% \todoandre{We have a multi-criteria optimization!}
+The requirements stated in Section \ref{sec:requirements} comprise several different objectives. Thus, we are dealing with a multi-objective optimization in which the different objectives have to be combined into one. As we are in an interactive setting, where the quality of the result is ultimately judged by the user, it seems natural to parametrize the algorithm. Through these parameters, the user can influence how the different objectives are weighted and thereby shape the final result while needing only a rough, intuitive, and non-technical understanding of the process.
+
+% \todoandre{Talk about the general idea of a potential function.}
+Following the above description, we optimize a single objective function. It combines terms corresponding to the requirements of Section \ref{sec:requirements}, each weighted by a parameter. We formulate the objective function such that smaller values are better, so we end up with a minimization problem: we simply evaluate the objective function for each sample and pick the one with the smallest value.
+
\todoandre{Introduce the potential function of the algorithm.}
+We now give the objective function. Let $p$ be the requested power and let $\alpha, \beta, \gamma \geq 0$ be the weighting parameters mentioned above. As before, for any sample $s$, let $t_s$ be the time at which $s$ was played last (the unit does not matter, as it is weighted by $\beta$ anyway), and let $r(s,t)$ be a number drawn uniformly and independently at random from $[0,1]$. At the current time $t$, we want to find the sample $s$ minimizing the objective function
+\[
+ f(s, t) \coloneqq \alpha \cdot (p-p_s)^2 + \beta \cdot (t_s - t)^{-2} + \gamma \cdot r(s,t),
+\]
+where $p_s$ is the power of sample $s$. We have to ensure that $t_s \neq t$ to avoid division by zero.\todo{Adapt to what is actually in the code.}
+
\todoandre{Talk about the single terms of the potential function.}
\todoandre{Maybe add some pseudo-code to make things easier to understand?}
+\begin{algorithm}
+\begin{algorithmic}
+  \Require requested power $p$, current time $t$, samples $S$ of the requested instrument
+  \State $s^* \gets \arg\min_{s \in S,\, t_s \neq t} f(s, t)$ \Comment{evaluate $f$ for every sample}
+  \State $t_{s^*} \gets t$ \Comment{remember when the chosen sample was played}
+  \State \Return $s^*$
+\end{algorithmic}
+\caption{High-level sketch of the sampling function: the objective function is evaluated for all samples of the requested instrument and a minimizer is returned.}
+\label{alg:sampling}
+\end{algorithm}
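+
+For illustration, the evaluation of the objective function could be implemented along the following lines. This is only a simplified C++ sketch; the names \texttt{Sample}, \texttt{objective}, and \texttt{uniform01} are illustrative and do not correspond to the actual DrumGizmo code base.
+\begin{verbatim}
+#include <limits>
+#include <random>
+
+struct Sample {
+  double power;       // p_s: power of the sample
+  long   last_played; // t_s: time the sample was played last
+};
+
+// Uniform random number in [0, 1), playing the role of r(s, t).
+static double uniform01()
+{
+  static std::mt19937 gen{std::random_device{}()};
+  static std::uniform_real_distribution<double> dist(0.0, 1.0);
+  return dist(gen);
+}
+
+// f(s, t) = alpha*(p - p_s)^2 + beta*(t_s - t)^(-2) + gamma*r(s, t)
+double objective(const Sample& s, double p, long t,
+                 double alpha, double beta, double gamma)
+{
+  double age = double(t - s.last_played);
+  if (age == 0.0) // the requirement t_s != t is violated
+    return std::numeric_limits<double>::infinity();
+  double power_term   = (p - s.power) * (p - s.power);
+  double recency_term = 1.0 / (age * age);
+  return alpha * power_term + beta * recency_term
+       + gamma * uniform01();
+}
+\end{verbatim}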
+
+Note that the worst-case complexity of finding the minimizer of the objective function is linear in the number of samples of the considered instrument. In practice, however, we can avoid evaluating the objective function for most samples by starting with the \enquote{most promising} sample and successively evaluating its neighbors until no further evaluation can beat the currently best value.
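+
+This pruned search could be realized, for instance, as in the following sketch, which reuses the \texttt{Sample} struct and the \texttt{objective} function from the sketch above and assumes that the samples of an instrument are stored sorted by power; again, the code is illustrative and not the actual DrumGizmo implementation. The pruning is sound because the power term alone is a lower bound on the objective function, as the two remaining terms are non-negative.
+\begin{verbatim}
+#include <algorithm>
+#include <cstddef>
+#include <vector>
+
+// Assumes 'sorted' is non-empty and sorted by increasing power.
+std::size_t find_sample(const std::vector<Sample>& sorted,
+                        double p, long t, double alpha,
+                        double beta, double gamma)
+{
+  // Start near the requested power (first sample with power >= p).
+  auto it = std::lower_bound(
+      sorted.begin(), sorted.end(), p,
+      [](const Sample& s, double v) { return s.power < v; });
+  long start = (long)(it - sorted.begin());
+  if (start == (long)sorted.size())
+    start = (long)sorted.size() - 1;
+
+  long best = start;
+  double best_val =
+      objective(sorted[start], p, t, alpha, beta, gamma);
+
+  // Expand downwards (dir = -1) and upwards (dir = +1).
+  for (int dir = -1; dir <= 1; dir += 2) {
+    for (long i = start + dir;
+         i >= 0 && i < (long)sorted.size(); i += dir) {
+      double diff  = p - sorted[i].power;
+      double bound = alpha * diff * diff; // lower bound on f
+      if (bound >= best_val)
+        break; // no sample further out can beat the best one
+      double val = objective(sorted[i], p, t,
+                             alpha, beta, gamma);
+      if (val < best_val) {
+        best_val = val;
+        best = i;
+      }
+    }
+  }
+  return (std::size_t)best;
+}
+\end{verbatim}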
+
\section{Implementation}
\todobent{Give a short introduction to DrumGizmo, including a link to the git repository.}
\todo{Talk about the timeline, i.e., when were the releases and what is still unreleased?}
-\todoandre{Talk about how the sampling algorithm was implemented}
+\todoandre{Talk about how the sampling algorithm was implemented. What do we need to store?}
\todoandre{Add some of the source code to the paper?}
\todoandre{Give less important implementation details, e.g., the adaptive search starting from the most promising sample}