Progress Reports | Black Box EECS351W19

Concerns about limitations of existing hardware

rjpat
Apr 5, 2019

Our whole project relies on three core characteristics: our sampling frequency, the speed of sound in a given medium, and the distance between our sensors. Our original proof of concept was an ideal testing scenario: the sensors were almost a meter apart, sampled at 96 kHz, and tested in open air in a quiet room. Based on the paper Implementing a Microphone Array on a Mobile Robotic Device by Brian Blosser and Matt Meshulam, a good benchmark for the number of distinct angles (T) that the sensor array can resolve is T = (d*Fs)/s, where d is the distance between sensors, Fs is the sampling frequency, and s is the speed of sound in the medium. For our original proof of concept, where d = 0.9 m, Fs = 96 kHz, and s = 343 m/s, this gives about 252 distinct angles between 0 and 90 degrees. In other words, we could resolve the angle in steps as fine as roughly 0.36 degrees.

However, if we move to the current testing array for the boat application, many of the characteristics of the sensor array change. For one, because the hull is only 40 cm wide, our d value becomes 0.1 m. In addition, sound travels substantially faster in water than in air. The exact speed depends on a number of factors, including salinity and temperature, but for the purposes of rough calculations our s value increases from 343 m/s to 1500 m/s. Assuming the same sampling frequency, our number of distinct angles drops to only 6.4. As a result, our finest possible angular step coarsens from roughly 0.36 degrees to 14.0625 degrees. While this could be acceptable for choosing among three docking locations across a field of view as wide as 120 degrees, we face another unforeseen issue.
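
As a rough check of the arithmetic above, the benchmark T = (d*Fs)/s can be evaluated for both setups. This is only a sketch: the function names are ours, and it assumes, as the estimate above does, that the T steps are spread evenly over 0 to 90 degrees.

```python
def distinct_angles(d_m, fs_hz, c_m_per_s):
    """T = d * Fs / s: whole-sample delay steps available across the array."""
    return d_m * fs_hz / c_m_per_s


def angular_step_deg(d_m, fs_hz, c_m_per_s, span_deg=90.0):
    """Average angular step if the T steps are spread evenly over span_deg."""
    return span_deg / distinct_angles(d_m, fs_hz, c_m_per_s)


# Open-air proof of concept: d = 0.9 m, Fs = 96 kHz, s = 343 m/s
air_angles = distinct_angles(0.9, 96_000, 343.0)       # about 252
air_step = angular_step_deg(0.9, 96_000, 343.0)        # roughly 0.36 degrees

# Boat array in water: d = 0.1 m, Fs = 96 kHz, s = 1500 m/s
water_angles = distinct_angles(0.1, 96_000, 1500.0)    # 6.4
water_step = angular_step_deg(0.1, 96_000, 1500.0)     # 14.0625 degrees

print(f"air:   T = {air_angles:.1f}, step = {air_step:.3f} deg")
print(f"water: T = {water_angles:.1f}, step = {water_step:.4f} deg")
```

Because T scales linearly with both d and Fs, doubling the sampling rate halves the step, which is the only one of the three quantities we can realistically change on the boat.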

For our original testing setup, we used 3 standard sound cards sampling at 96 kHz, and the "signal" we used was a hand clap. Two glaring problems are present with this setup. The first is that the sound of hands clapping consists largely of frequencies that are audible to the human ear, whereas our ultimate goal is to detect the ultrasonic pulse emitted by a black box, which falls between 20 kHz and 45 kHz. The second is that the vast majority of today's sound cards are designed to process audio that humans can hear, typically between 20 Hz and 20 kHz. While the hydrophones we are using can detect up to 100 kHz, the sound card remains an issue.

Looking Ahead

Looking ahead to the next three weeks, there are a few tasks at hand. We need to increase our sampling rate and our ability to listen to higher frequencies. While we can get access to a sound card that samples at 192 kHz, which would improve our angular step to 7.03 degrees per measurement, this does not allow us to detect higher frequencies. After a conversation with Professor Achilles, our steps moving forward are clearer. The first option, reusing existing sound cards, looks unlikely: Professor Achilles expects that sound cards are filtered in hardware to pass only frequencies below 20 kHz and, as a result, cannot be retrofitted for our purpose. However, several professors at Michigan design systems that listen at these frequencies and may be able to help us. Alternatively, there may be a way to modulate the pinger frequencies down to lower frequencies that our sound card can capture. Lastly, there are several off-the-shelf solutions designed for listening to dolphins or bats that can process higher frequencies and could be retrofitted for our purposes. However, more research is still needed on the viability of those options. In the meantime, our best course of action is to continue testing in open air in order to develop our system.
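
One way the frequency-shifting idea could work is heterodyne mixing: multiplying the ping by a fixed reference tone produces copies at the difference and sum of the two frequencies. The sketch below only illustrates that arithmetic numerically. The 40 kHz ping, the 30 kHz reference tone, and the 192 kHz simulation rate are all hypothetical values, and in practice the mixing would have to happen in analog hardware before the signal reaches the sound card.

```python
import cmath
import math

FS = 192_000        # simulation sample rate in Hz (assumed)
N = 1_920           # 10 ms of samples, so every test tone fits a whole number of cycles
PINGER_HZ = 40_000  # hypothetical ping frequency inside the 20-45 kHz band
LO_HZ = 30_000      # hypothetical reference (local oscillator) frequency


def tone_amplitude(samples, freq_hz, fs=FS):
    """Amplitude of the component at freq_hz, measured with a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / fs)
              for i, x in enumerate(samples))
    return 2 * abs(acc) / len(samples)


ping = [math.cos(2 * math.pi * PINGER_HZ * i / FS) for i in range(N)]
reference = [math.cos(2 * math.pi * LO_HZ * i / FS) for i in range(N)]
mixed = [p * r for p, r in zip(ping, reference)]

# cos(a) * cos(b) = 0.5 * cos(a - b) + 0.5 * cos(a + b)
print(tone_amplitude(mixed, PINGER_HZ - LO_HZ))  # difference tone at 10 kHz
print(tone_amplitude(mixed, PINGER_HZ + LO_HZ))  # sum tone at 70 kHz
print(tone_amplitude(mixed, PINGER_HZ))          # original 40 kHz is gone
```

The 10 kHz difference tone sits inside the audible band of a standard sound card, and a low-pass filter would remove the 70 kHz sum tone.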

Moreover, another task is to be able to measure the angle and range in air. Although our current course is to test in open air, we also need to obtain water data so we can determine the feasibility of our system design.

Additional Plots/Data and DSP Tool Finding

ahmadhf4
Mar 29, 2019

Plots of our time-domain signals, power spectral density, and spectrograms are provided below. The time-domain plot comes from a three-microphone setup in which we clapped from different sides of the room; the spikes indicate when each microphone received the clap. The next plot shows the power spectrum for all three signals. Essentially, it tells you how much power is contained at each frequency within a signal, which is useful for finding the frequency components that are strong or weak. The spectrogram, as we learned in class, provides a visual representation of the spectrum of a signal as it varies in time. It will be useful because it lets us visualize the signal strength, or in our context how "loud" the signal is, across both time and frequency. The Signal Analyzer App, which I mention below, was extremely useful for producing these visual aids.
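
To make the spectrogram idea concrete, here is a minimal sketch that cuts a signal into frames and takes the magnitude spectrum of each frame. It is not how the plots in this post were produced (those came from the Signal Analyzer App). It uses a synthetic two-tone test signal, a plain DFT with no window function, and names of our own choosing.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude spectrum of one frame, from bin 0 up to the Nyquist bin."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectrogram(samples, frame_len):
    """One magnitude spectrum per non-overlapping frame (rows are time steps)."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    return [dft_magnitudes(samples[s:s + frame_len]) for s in starts]


def dominant_freqs(samples, fs, frame_len):
    """Strongest frequency in each frame, in Hz."""
    peaks = []
    for mags in spectrogram(samples, frame_len):
        k = max(range(len(mags)), key=mags.__getitem__)
        peaks.append(k * fs / frame_len)
    return peaks


FS = 8_000   # assumed sample rate for the synthetic signal
FRAME = 80   # 10 ms frames, so bins are 100 Hz apart

# Synthetic test signal: 20 ms at 500 Hz followed by 20 ms at 2000 Hz
signal = ([math.sin(2 * math.pi * 500 * i / FS) for i in range(160)]
          + [math.sin(2 * math.pi * 2000 * i / FS) for i in range(160)])

print(dominant_freqs(signal, FS, FRAME))
```

Each row of the spectrogram is a power-spectrum-style snapshot of one short slice of time, which is why the change from 500 Hz to 2000 Hz shows up partway through the list.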

New DSP Tool

Working on the data so far, one tool that I've learned about is the Signal Analyzer App. It's practical because it provides an interactive, visual interface for comparing and measuring different signals in either the time or frequency domain. The audio files are read with the audioread function so that they can be stored as arrays. Once they've been read, the Signal Analyzer App provides a plethora of options for working with the data, which have proven extremely useful. It will certainly be convenient over the next three weeks.

Figure 3: Spectrogram (Middle Microphone)

Figure 4: Spectrogram (Right Microphone)

Figure 5: Spectrogram (Left Microphone)

Initial Project Proposal

Team Blackbox
Mar 26, 2019

Figure 1: Sound wave representation of claps from three microphones

On Saturday, we conducted a demo to see how the proposed three-microphone setup would perform. From different positions throughout the room, we clapped to see how each microphone would pick up the sound. Note that one of the microphones was faulty, so my MacBook acted as the central microphone, which is why the waveform is much larger in amplitude for that channel.

From the figure, each clap produced an abrupt, brief waveform. When we clapped from the left side of the room, the left microphone picked up the sound wave first, followed by the central microphone and then the right microphone. The arrival delays therefore vary with the position of the clap, which is expected. Originally, our dataset did not show these delays between microphones. We had to calibrate it by placing all three microphones in the same position, and found that the recordings needed an offset correction of roughly 12 centiseconds (about 120 ms). Once we applied this correction, we obtained the three channels of Figure 1.
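
The delay between two channels can be estimated by cross-correlation: slide one recording against the other and keep the shift where they match best. The sketch below uses a synthetic burst in place of a real clap. The far-field angle formula, the 96 kHz rate, and the 0.9 m spacing are assumptions for illustration rather than our measured setup.

```python
import math


def best_lag(ref, other, max_lag):
    """Shift (in samples) at which `other` best lines up with `ref`.

    A positive result means the sound reached `other` later than `ref`.
    """
    def correlation(lag):
        return sum(ref[i] * other[i + lag]
                   for i in range(len(ref))
                   if 0 <= i + lag < len(other))
    return max(range(-max_lag, max_lag + 1), key=correlation)


def arrival_angle_deg(lag_samples, fs_hz, spacing_m, c=343.0):
    """Far-field angle from broadside implied by a delay between two microphones."""
    ratio = c * lag_samples / (fs_hz * spacing_m)
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.asin(ratio))


# Synthetic "clap": a short decaying burst, arriving 30 samples later on the right
clap = [math.exp(-i / 5.0) * math.sin(i) for i in range(40)]
left = [0.0] * 100 + clap + [0.0] * 160
right = [0.0] * 130 + clap + [0.0] * 130

lag = best_lag(left, right, max_lag=60)
angle = arrival_angle_deg(lag, fs_hz=96_000, spacing_m=0.9)
print(lag, round(angle, 2))
```

The same routine applies to the calibration step: with all microphones in one place, any nonzero lag is a recording offset that can be subtracted from later measurements.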

Figure 2 below shows the setup we used to collect the data. As mentioned before, we made clapping sounds in different parts of the room to emulate a black box emitting a repeated pulse. Based on the preliminary data collected using this prototype setup, we believe that the project will be feasible.

Figure 2: Our proof of concept black box idea along with data collection

Our data is stored on Google Drive at the link below, which has been made accessible for sharing.

https://drive.google.com/open?id=1JWpNGOTo4U7Q6posbl2ufL9C4foFhKlt

Files attached:

Sound wave demo pre and post calibration
Clapping video demonstration and set-up

ESPRIT: Estimation of Signal Parameters via Rotational Invariance Techniques, R. Roy and T. Kailath

Summary

The ESPRIT algorithm dramatically reduces the computation and storage costs of DOA estimation by requiring that the sensor array possess a displacement invariance: the sensors occur in matched pairs with identical displacement vectors.

ESPRIT Data Model:

Take m doublets of sensors as shown in Figure 3 below. Assume there are d ≤ m narrow-band sources centered at frequency ω0, and that the sources are far enough away from the receiving array that the wavefronts reaching the array are planar.

Figure 3: Sensor array geometry for multiple source DOA estimation using ESPRIT

ESPRIT exploits the translational invariance of the array, and can be applied with arbitrary sensor gain and phase patterns.

Discussion

ESPRIT reduces computation time by replacing the long search procedure inherent in other methods, and produces signal parameter estimates in terms of generalized eigenvalues. This requires computations of the order d^3.
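
To illustrate the rotational invariance idea, here is a heavily simplified sketch for a single source and a single noise-free snapshot. In that special case the signal subspace is just the snapshot itself, so no eigendecomposition is needed. The uniform line array, the half-wavelength spacing, and the use of two overlapping subarrays in place of physical doublets are assumptions of this sketch and are not taken from the paper.

```python
import cmath
import math

SPACING_WAVELENGTHS = 0.5  # assumed sensor spacing, in wavelengths


def snapshot(angle_deg, n_sensors):
    """Noise-free plane-wave snapshot on a uniform line array (assumed model)."""
    phase = 2 * math.pi * SPACING_WAVELENGTHS * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * phase * k) for k in range(n_sensors)]


def esprit_single_source(x):
    """Estimate one arrival angle from the phase rotation between two subarrays."""
    first = x[:-1]   # sensors 0 .. N-2
    second = x[1:]   # sensors 1 .. N-1: the same subarray displaced by one spacing
    # Least-squares solution of: second = rotation * first
    numerator = sum(a.conjugate() * b for a, b in zip(first, second))
    denominator = sum(abs(a) ** 2 for a in first)
    rotation = numerator / denominator
    sin_angle = cmath.phase(rotation) / (2 * math.pi * SPACING_WAVELENGTHS)
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_angle))))


estimate = esprit_single_source(snapshot(20.0, 8))
print(round(estimate, 6))
```

The angle comes directly from the phase of the rotation term, with no scan over candidate angles, which is the source of the computational savings described above. With d sources and noisy data, the rotation becomes a d-by-d matrix estimated from the covariance eigenvectors.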

Professor Raviraj Adve

Department of Electrical and Computer Engineering

University of Toronto

Direction of Arrival Estimation

https://www.comm.utoronto.ca/~rsadve/Notes/DOA.pdf

Summary:

There are multiple algorithms that have been developed to solve the Direction of Arrival (DOA) problem. Depending on the complexity and capabilities required by a given use case, different algorithms may suit best. For the purposes of this paper, the consistent testing framework involves a straight line of N sensors along an axis, used to determine the angles (φ) of M incoming signals. The 5 techniques detailed in the DOA paper are correlation, Maximum Likelihood, MUSIC, ESPRIT, and Matrix Pencil.

Models:

Cramer-Rao Bound

The Cramer-Rao Bound (CRB) gives the minimum variance that any unbiased estimator can achieve. The idea is that it provides a common benchmark against which to judge the merits and drawbacks of a given algorithm.

The CRB theorem: Given a length-N vector of received signals x dependent on a set of P parameters θ = [θ1, θ2, ..., θP]^T, corrupted by additive noise (n),

x = v(θ) + n,

where v(θ) is a known function of the parameters. (Adve)

For the purpose of processing signals in a given sensor array, the model is represented as:

x = αs(φ) + n

where s(φ) is the steering vector for a signal arriving from angle φ, and α is a nuisance parameter: an unknown (here, the signal amplitude) that is not itself of interest but affects how accurately φ can be estimated.

DOA Estimation using Correlation

The correlation method is the simplest DOA estimator: the received data are correlated with the steering vector for each candidate angle φ, and the result shows a series of spikes centered at the angles from which signals arrive.

With M incoming signals, the correlation is plotted against the candidate angle φ, and it reaches a maximum where φ = φm.

Here Pcorr(φ) is a non-adaptive estimate of the spectrum of the incoming signals. As a result, the M largest peaks give the predicted angles.
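
A minimal sketch of the correlation scan follows, for one noise-free source. The array model (uniform line array, half-wavelength spacing, angle measured from broadside) and the normalization of the spectrum are our assumptions and may differ from the conventions in the notes.

```python
import cmath
import math


def steering_vector(angle_deg, n_sensors, spacing_wavelengths=0.5):
    """Assumed model: uniform line array, angle measured from broadside."""
    phase = 2 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * phase * k) for k in range(n_sensors)]


def correlation_spectrum(x, angles_deg):
    """|s(phi)^H x|^2 / N for each candidate angle phi."""
    n = len(x)
    spectrum = []
    for angle in angles_deg:
        s = steering_vector(angle, n)
        inner = sum(si.conjugate() * xi for si, xi in zip(s, x))
        spectrum.append(abs(inner) ** 2 / n)
    return spectrum


angles = [a * 0.5 for a in range(-180, 181)]     # -90 to 90 degrees in 0.5 degree steps
x = [0.7 * v for v in steering_vector(20.0, 8)]  # noise-free snapshot, source at 20 degrees

spectrum = correlation_spectrum(x, angles)
estimate = angles[max(range(len(spectrum)), key=spectrum.__getitem__)]
print(estimate)
```

With several sources, the M largest peaks of the same spectrum would be read off instead of the single maximum.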

Algorithms:

The three primary classes of algorithms addressed in this paper are “MUSIC”, “ESPRIT”, and “Matrix Pencil”. MUSIC also has variants called Root-MUSIC and Smooth-MUSIC.

ESPRIT was detailed above in the description of the previous paper; however, MUSIC still needs to be described.

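MUSIC at its core relies on matrix algebra and the assumption that the incoming signals are uncorrelated. Under this assumption, the covariance matrix of the received data can be decomposed into eigenvectors, and those corresponding to the smallest eigenvalues (zero, in the noise-free case) span the noise subspace. The steering vectors of the true sources are orthogonal to this subspace, so the pseudospectrum, the inverse of the projection of each candidate steering vector onto the noise subspace, peaks sharply at the source angles. Once this pseudospectrum is graphed, the highest M peaks give the approximate incoming angles of the source signals.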
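
The pseudospectrum can be sketched for the single-source case, which matches our single-beacon setup. With one source, the signal subspace is spanned by the dominant eigenvector of the covariance matrix, and the noise subspace is everything orthogonal to it. The sketch finds that eigenvector by power iteration to stay within the standard library. The array model, the exact (not sample-estimated) covariance, and the noise level are all assumptions.

```python
import cmath
import math


def steering(angle_deg, n, spacing_wl=0.5):
    """Assumed model: uniform line array, angle measured from broadside."""
    phase = 2 * math.pi * spacing_wl * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * phase * k) for k in range(n)]


def dominant_eigenvector(matrix, iterations=50):
    """Power iteration: unit eigenvector belonging to the largest eigenvalue."""
    n = len(matrix)
    v = [1.0 + 0j] + [0j] * (n - 1)
    for _ in range(iterations):
        w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(abs(z) ** 2 for z in w))
        v = [z / norm for z in w]
    return v


def music_spectrum(cov, angles_deg):
    """Single-source MUSIC: 1 / (energy of s(phi) in the noise subspace)."""
    n = len(cov)
    u = dominant_eigenvector(cov)  # spans the signal subspace when M = 1
    spectrum = []
    for angle in angles_deg:
        s = steering(angle, n)
        in_signal = abs(sum(ui.conjugate() * si for ui, si in zip(u, s))) ** 2
        in_noise = max(n - in_signal, 1e-12)  # ||s||^2 = n; clamp avoids dividing by zero
        spectrum.append(1.0 / in_noise)
    return spectrum


N = 8
source = steering(20.0, N)
# Exact covariance R = s s^H + 0.1 * I (unit-power source, white noise of power 0.1)
cov = [[source[r] * source[c].conjugate() + (0.1 if r == c else 0.0)
        for c in range(N)] for r in range(N)]

angles = [a * 0.5 for a in range(-180, 181)]
spectrum = music_spectrum(cov, angles)
estimate = angles[max(range(len(spectrum)), key=spectrum.__getitem__)]
print(estimate)
```

For M sources the signal subspace would have M eigenvectors, which calls for a full eigendecomposition and not power iteration.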

Matrix Pencil differs from MUSIC and ESPRIT in that it does not rely on a correlation matrix R. In addition, those algorithms require a large number of samples in order to be effective, while Matrix Pencil was created for use cases where the scenario is rapidly changing.

Maximum Likelihood Estimator (MLE) for Direction-of-Arrival Estimation

When you use maximum likelihood methods, you are essentially searching for the parameter value that makes the observed data most likely; it is a way of estimating the parameters of a model from observations. We have two unknown parameters, the DOA (ϕ) and the magnitude (α). The maximum likelihood estimate of ϕ is the value that maximizes f(x | α, ϕ), the pdf of the data vector x given the parameters α and ϕ.

To solve for the DOA estimate, we must find the angle that maximizes the resulting function of ϕ (the maximum likelihood estimate of the spectrum of the incoming data).

Something to note is that when only one user exists and the interference covariance matrix is Rn = σ^2*I, this reduces to the DOA Estimation using Correlation method described above. There are downsides to this algorithm, however. It requires a lot of computational resources, and it assumes we know the interference covariance matrix. In reality, this is not commonly the case. Moreover, estimating the covariance of the interference by itself is nearly impossible, as noted in the paper.
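
The reduction to the correlation method can be sketched in a few lines, under the assumption (ours, not stated in the summary above) that the noise n is zero-mean complex Gaussian with covariance Rn. The notation follows the model x = αs(φ) + n used earlier and may differ from the notation in the notes.

```latex
\begin{aligned}
(\hat{\phi}, \hat{\alpha})
  &= \arg\max_{\phi,\alpha} f(\mathbf{x} \mid \alpha, \phi)
   = \arg\min_{\phi,\alpha}
     \big(\mathbf{x} - \alpha\,\mathbf{s}(\phi)\big)^{H} \mathbf{R}_n^{-1}
     \big(\mathbf{x} - \alpha\,\mathbf{s}(\phi)\big) \\
\hat{\alpha}(\phi)
  &= \frac{\mathbf{s}^{H}(\phi)\,\mathbf{R}_n^{-1}\,\mathbf{x}}
          {\mathbf{s}^{H}(\phi)\,\mathbf{R}_n^{-1}\,\mathbf{s}(\phi)} \\
\hat{\phi}
  &= \arg\max_{\phi}
     \frac{\big|\mathbf{s}^{H}(\phi)\,\mathbf{R}_n^{-1}\,\mathbf{x}\big|^{2}}
          {\mathbf{s}^{H}(\phi)\,\mathbf{R}_n^{-1}\,\mathbf{s}(\phi)} \\
\mathbf{R}_n = \sigma^{2}\mathbf{I}
  \;\Rightarrow\;
\hat{\phi}
  &= \arg\max_{\phi}
     \frac{\big|\mathbf{s}^{H}(\phi)\,\mathbf{x}\big|^{2}}
          {\sigma^{2}\,\mathbf{s}^{H}(\phi)\,\mathbf{s}(\phi)}
\end{aligned}
```

In the last line, σ² and s^H(φ)s(φ) do not depend on the data, so the estimate is simply the peak of the correlation between x and the steering vector.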

Conclusion:

This paper ultimately concludes that each method has its merits depending on the use case. While all have similar levels of accuracy, MUSIC and ESPRIT both require N sensors to resolve N-1 signals. Meanwhile, Matrix Pencil can only resolve N/2 signals from a given array. However, Matrix Pencil does not involve the calculation of a covariance matrix.

As a result, Matrix Pencil ultimately serves as the model with the greatest advantage. This is due to its increased speed, but also its increased accuracy, because it avoids the inconsistencies introduced by estimating a covariance matrix.

Maximum Likelihood Methods for Direction-of-Arrival Estimation

P. Stoica & K.C. Sharman

This 1990 paper presents five methods based on the maximum likelihood (ML) principle for distinguishing multiple signals received at an array of sensors.

A detailed mathematical explanation of each method would involve a large amount of notational clarification and is therefore omitted. Below is a summary of the methods, along with their pros and cons:

Name: MLM (deterministic or conditional maximum likelihood method)

Description: Application of the ML principle to the statistics of the observed raw data.

Pro:

Con: computationally intensive, and not statistically efficient for practical cases.

Name: MUSIC-1 (multiple signal classification 1)

Description: a brute force approximation to the MLM.

Pro: computationally much simpler than the MLM

Con: provides significantly less accurate estimates

Name: MUSIC-2 (multiple signal classification 2)

Description: an improved version of MUSIC-1, obtained by applying the ML principle to the statistics of certain linear combinations of the sample noise space eigenvectors.

Pro: computationally much simpler than the MLM

Con: provides significantly less accurate estimates

Name: MODE-1 (method of direction estimation 1)

Description: a large sample realization of the ML estimator, a compromise between statistical performance of MLM and computational simplicity of MUSIC.

Pro: computationally simpler than the MLM

Con:

Name: MODE-2 (method of direction estimation 2)

Description: obtained using the ML principle on the statistics of certain linear combinations of the sample eigenvectors.

Pro: computationally simpler than the MLM, and statistically more efficient than MLM

Con:

The authors regard the MODE-2 method as the significant new result in the field (1990), offering multiple advantages over the other methods (although MODE-1 and MODE-2 are close in terms of computation).

Comment: For our use now, we are only looking at a single source beacon. Although there will be interference from neighboring competition spots, we believe it can be filtered out before we start direction-of-arrival analysis. Even if we were to introduce multiple other noise sources or interferers, the goal would be to filter out irrelevant signals rather than to recognize all of them. Therefore, these methods are not useful to us for now.

That being said, if we later move on to exploring underwater localization using multiple source beacons, where we will try to separate the signals, the methods introduced in this paper will be of great value.