
Fundamental Theory 

Initial Setup

 

In Figure 1 below, we have four hydrophones spaced 18 cm apart, along with four independent USB sound cards. Our beacon is a JW Fishers MFP-1 pinger with a variable output range of 20 kHz to 40 kHz and pulse intervals of 0.5, 1, and 2 seconds. Because the sound cards have a built-in hardware low-pass filter, we had to use the pinger's lower limit of 20 kHz for this project. We used Audacity on Mac OS X to collect the audio data. Figure 2 shows an example of how we positioned our beacon and hydrophone array.


Screen Shot 2019-04-23 at 8.37.00 PM.png
Screen Shot 2019-04-23 at 8.37.16 PM.png

Figure 1: Hydrophone Array and Beacon

Screen Shot 2019-04-23 at 8.43.18 PM.png

Figure 2: Hydrophone Array and Beacon in action

FIR Filtering

 

As with all real-world data, our data is riddled with noise. The first step in cleaning it is therefore filtering for the predetermined frequency of the pulse. At competition, we will be told the designated pulse frequency to listen for; for evaluating our model, the desired pulse is set at 20 kHz. As a first cleaning step, we created a band-pass filter centered at 20 kHz with a tolerance of 500 Hz on either side. While the USB sound cards we were using had a strict low-pass filter at 22 kHz, we felt it necessary to make our system portable to more capable hardware and implement a full band-pass filter.
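A band-pass of this kind can be sketched in a few lines. The snippet below is a minimal illustration in Python/SciPy rather than a listing of our actual pipeline, and the tap count is an arbitrary illustrative choice:

```python
import numpy as np
from scipy.signal import firwin, filtfilt

FS = 48_000        # sampling rate of our USB sound cards (Hz)
F_PULSE = 20_000   # designated pinger frequency (Hz)
TOL = 500          # tolerance on either side of the pass band (Hz)

def bandpass_20khz(x, numtaps=255):
    """Zero-phase FIR band-pass around the pinger frequency.

    numtaps is an illustrative filter order, not a tuned value.
    """
    taps = firwin(numtaps, [F_PULSE - TOL, F_PULSE + TOL],
                  pass_zero=False, fs=FS)
    # filtfilt runs the filter forward and backward, so the
    # band-pass adds no phase delay of its own
    return filtfilt(taps, [1.0], x)
```

Applying the filter forward and backward (`filtfilt`) matters here because everything downstream of this step depends on timing, and a one-directional FIR pass would shift every channel by half the filter length.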

Pre-filtering

Annotation 2019-04-23 192850.jpg

Figure 3: Pre-Filtered Signal

Annotation 2019-04-23 192735.jpg

Post-filtering

Figure 4: Post-Filtered Signal

The resulting data can be visualized with a spectrogram, which shows the effect of our filter. From this point on, all noise and errors in our data occur in the same frequency range as our desired pulse.

 

After filtering, we normalize all of our channels. We first calculate an offset by taking the average of each audio signal, then subtract that offset from the signal to produce a zero-offset waveform. This step is important because a residual DC offset would introduce an additional term into the cross-correlation and could throw the channels out of alignment.
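The normalization amounts to one line of arithmetic; a minimal sketch, assuming NumPy arrays for the channels:

```python
import numpy as np

def remove_dc_offset(channel):
    """Subtract the channel mean to obtain a zero-offset signal.

    Without this step, a residual DC term contributes a constant
    bias to the cross-correlation and can pull channels out of
    alignment.
    """
    channel = np.asarray(channel, dtype=float)
    return channel - np.mean(channel)
```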

Split the Calibration and Measurement Waveforms

 

The next step in our process was to designate calibration and testing data. At the beginning of each recording, we perform a calibration by bringing the beacon close to each hydrophone for several pulses. We then continue the recording with the beacon deployed a predetermined distance away from the hydrophone array for the remainder of the recording. While re-calibrating may not be necessary in future tests, error checking on every recording allows for the best assessment of our DSP pipeline's success.

 

Prior to performing any significant data analysis, it's necessary to separate our calibration waveforms from our measurement waveforms. For one, these two segments differ dramatically in magnitude: observe in Figure 4 how before roughly 20 seconds (our calibration period), the intensity is far greater than during the rest of the recording. As we will cover in pulse extraction, our pipeline relies on thresholding to distinguish valuable data from noise: once the two waveforms are separated, we only extract pulses that exceed 20 times the average absolute value and fall within our window. If we did not separate the calibration data from the measurement data, a majority, if not all, of the measurement data would fall below the threshold.
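The split itself is a simple slice once the calibration boundary is chosen. A sketch, assuming a 20-second calibration period like the one visible in Figure 4:

```python
import numpy as np

FS = 48_000  # samples per second

def split_recording(x, calibration_end_s=20.0):
    """Split one channel into calibration and measurement segments.

    The 20 s default mirrors the calibration period visible in
    Figure 4; in practice the boundary is chosen per recording.
    """
    n = int(calibration_end_s * FS)
    return x[:n], x[n:]
```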

Pulse Extraction

 

The next step in our process is to extract the significant data from our recorded pulses. The first segment of a recording is calibration data, used to correct for potential delays in hardware and variations in our testing array; the second segment is our measurement data. For the recording below, our testing array is set up in line with the beacon, with the beacon located to the left of the left-most hydrophone. Our data should therefore show the pulse arriving at the left-most hydrophone first, then the left-middle, right-middle, and right-most hydrophones in order. However, because we abstract one audio interface, in software, over four independent USB sound cards, there is a hardware-based offset in the recorded signals that we have to correct for.


Screen Shot 2019-04-23 at 9.16.45 PM.png

Figure 5: Example of a Pulse

Our first step in pulse extraction is to set a baseline threshold. Averaging the absolute amplitudes over the entire measurement segment, we set the threshold at 20 times that average: any value above it is deemed a potential pulse, and anything below it is treated as noise. The samples that satisfy this baseline threshold are shown in Figure 6.

Screen Shot 2019-04-23 at 9.16.50 PM.png

Figure 6: Signals that satisfy our threshold

Our next step was to create a window of interest around every source pulse. As a baseline, we determined that the significant components of a source pulse occur within a 0.7 ms window, spanning from the moment the rising edge of the pulse crosses the threshold at the first hydrophone to the moment its falling edge drops below the threshold at the last hydrophone. Pulses above the threshold, extended by that 0.7 ms window on either side, become our signals of interest. Essentially, we convolve this 0.7 ms window with each of the above-threshold pulses from Figure 6 in order to find the start and end of each pulse. Afterwards, we shift back slightly to retain the small features of the signal that occurred before the pulse commenced. We then take the logical union of all the channel pulses to determine the overall beginning and end times, which we use to extract all the pulses, as shown in Figure 7. To reject false pulses, we check whether all four channel pulses fall inside the union; if they do not, the pulse is removed from our set.
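The thresholding and windowing steps above can be sketched as follows; the function names and the dilation-via-convolution trick are our illustration, not a listing of the actual implementation:

```python
import numpy as np

FS = 48_000
WINDOW = int(0.7e-3 * FS)  # the 0.7 ms window, in samples

def pulse_mask(channel, k=20):
    """Mark the extent of each pulse on one channel.

    Samples above k times the mean absolute value are candidate
    pulse samples; convolving that boolean mask with a window of
    ones widens each run to cover the whole 0.7 ms pulse.
    """
    channel = np.asarray(channel, dtype=float)
    above = np.abs(channel) > k * np.mean(np.abs(channel))
    widened = np.convolve(above.astype(float),
                          np.ones(2 * WINDOW + 1), mode="same") > 0
    return widened

def union_mask(channels, k=20):
    """Logical union across channels: the overall pulse extents."""
    return np.logical_or.reduce([pulse_mask(c, k) for c in channels])
```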

Screen Shot 2019-04-23 at 9.16.54 PM.png

Figure 7: Threshold segments shifted and connected

Cross-Correlation


The last of the data cleaning occurs in cross-correlation, as shown in Figure 8 below. Cross-correlation is a DSP technique centered on detecting the same pulse within the four individual hydrophone recordings. However, because we are dealing with digital signals, this can only be one component of determining the time delay across the entire hydrophone array.

 

Cross-correlation measures the similarity between two signals as one is shifted relative to the other, and we apply it across all four channels. The most significant feature is the maximum of the cross-correlation function: at that lag the signals are best aligned, making it our primary measure of the time delay. However, because the sampling rate of the four signals is limited to 48 kHz, there are scenarios where cross-correlation alone is limited in its accuracy.
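A minimal sketch of peak-picking the cross-correlation, assuming NumPy arrays and the convention that a positive lag means the first channel received the pulse later:

```python
import numpy as np

FS = 48_000

def estimate_delay(a, b):
    """Lag of channel a relative to channel b, in samples and seconds.

    The argmax of the full cross-correlation is the shift at which
    the two recordings line up best; resolution is one sample,
    i.e. 1/48 kHz at our sampling rate.
    """
    corr = np.correlate(a, b, mode="full")
    lag = int(np.argmax(corr)) - (len(b) - 1)
    return lag, lag / FS
```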

 

Some signals cannot be aligned this way when the wave measurements at two hydrophones are sampled out of phase. To correct for this, we utilize a Lissajous figure.


Screen Shot 2019-04-23 at 9.36.30 PM.png

Figure 8: Cross-Correlation

Lissajous Figure


The Lissajous figure is an abstract mathematical construct that we use both as a sanity check and as a way to determine the phase difference between our hydrophones. To form a Lissajous figure, we plot two sinusoids as the x and y components of a single curve. If the sinusoids are in phase and of the same frequency, the resulting figure is a straight line along y = x.


However, if the signals are of the same frequency but out of phase, the resulting figure forms an ellipse. In our data, we use the ratio between the major and minor axes of the ellipse to determine the phase offset between two signals. If the ratio is large and the figure closely resembles the ideal y = x line, the required offset is small. If the ellipse instead resembles a circle or the line y = -x, the phase offset approaches 90° or 180°, respectively.
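The same phase can be read off numerically instead of graphically: for zero-mean, equal-frequency tones spanning whole periods, the normalized inner product of the two channels equals the cosine of the phase offset. A sketch of that equivalent estimator (our illustration, not the implementation):

```python
import numpy as np

def lissajous_phase(x, y):
    """Phase difference between two equal-frequency sinusoids.

    For zero-mean tones spanning whole periods, the normalized inner
    product equals cos(phase): 1 gives the y = x line, 0 a circle,
    and -1 the y = -x line, matching the Lissajous shapes above.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(c, -1.0, 1.0))
```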


From this phase offset we are able to increase the fidelity of the delay estimate. For instance, if the Lissajous figure indicates a 90° offset, the additional time offset is one quarter of the period of a 20 kHz wave, or 1.25×10^-5 seconds. In our setup, this corresponds to a difference in path length of roughly 10% of the hydrophone spacing. Information that would otherwise be lost can thus be recovered from the same source data using Lissajous figures.
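The arithmetic for the 90° case, using the speed-of-sound value we assume later in our simulations:

```python
F_PULSE = 20_000        # pinger frequency (Hz)
SPEED_OF_SOUND = 1420   # m/s in water at ~7 C, as in our simulation
SPACING = 0.18          # hydrophone spacing (m)

period = 1 / F_PULSE                 # 5.0e-5 s per 20 kHz cycle
offset = period / 4                  # a 90 degree offset is a quarter period
distance = offset * SPEED_OF_SOUND   # extra path length this resolves

# offset is 1.25e-5 s; distance is ~1.8 cm, about 10% of the 18 cm spacing
```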


Time of Arrival (ToA) using Hyperbolic Surfaces


The final component of our analysis is the Time of Arrival computation itself. Our first step is to find the relationship between two pairs of hydrophones; for this, we use hydrophones 1 and 3 as well as hydrophones 2 and 4. By measuring the time delays with cross-correlation and Lissajous-figure offsets, we end up with two sets of hyperbolic surfaces.


These two surfaces, when overlaid, intersect in a circle around the x-axis in the y-z plane. For simplicity, we assume the source pulse lies in the same xy-plane as our hydrophones, so the intersection of the hyperbolic surfaces with the xy-plane is two points. These points are located on opposite sides of the hydrophone array, equidistant from the x-axis. We break the tie by assuming that the source is in front of us.
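Within the array's plane, each pair's delay therefore reduces to a bearing angle. A far-field sketch (the arcsin form is the asymptote of the hyperbola; the 36 cm pair spacing assumes hydrophones 1 and 3 sit two 18 cm slots apart):

```python
import numpy as np

SPEED_OF_SOUND = 1420.0  # m/s, as assumed in our simulation
PAIR_SPACING = 0.36      # m; hydrophones 1 and 3 (or 2 and 4), two slots apart

def bearing_from_delay(dt):
    """Far-field bearing for one hydrophone pair, in degrees from broadside.

    Far from the array, the hyperbola branch approaches a straight
    line at arcsin(c * dt / d). When |c * dt| exceeds the pair
    spacing d, no geometry can produce the measured delay, so we
    return None.
    """
    ratio = SPEED_OF_SOUND * dt / PAIR_SPACING
    if abs(ratio) > 1:
        return None  # physically impossible delay -> no solution
    return float(np.degrees(np.arcsin(ratio)))
```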


However, this model still has limitations. For one, there exists a scenario where the hyperbolic surfaces cannot intersect because the source pulse lies on the x-axis. Moving forward, we may need to consider running a separate model on data deemed a non-real result.


Screen Shot 2019-04-23 at 9.47.34 PM.png
1200px-Lissajous_phase.svg.png

Figure 9: Lissajous Reference Guide

Misallign Lissajours 1.jpg

Figure 10: Lissajous Figure of Recorded Data

Figure 11: Lissajous Figure of Offset Data

Hyperbolic Plane Image 1.jpg


Figure 12: Hyperbolic Surface from One Pair of Hydrophones

Annotation 2019-04-23 221958.jpg


Annotation 2019-04-23 222510.jpg
Annotation 2019-04-23 222542.jpg

Figure 13: Overlaid Hyperbolic Surfaces from Two Pairs of Hydrophones

Figures 14 & 15: Resultant Angle measures for Test Set 3 of Hydrophone Pairs

The red blips mark measured delays larger than the hardware setup could possibly induce in reality (no solution to the equations). This happens particularly when the beacon is in line with the sensors. Whenever we see errors or fluctuating angles, we can attribute them to a host of reasons, including the following:

1. Insufficient (and inaccurate) calibration data  

2. Non-static offsets among channels.

2.1 Parasitic capacitance/resistance changes when we rotate our beam, which result in varying measurement offsets.

2.2 Clock skew and non-uniform sampling rate in the ADC chip.

3. Other unaccounted behaviors and errors.

 

General Trend of Figure 15:

 

The angle calculations occur after calibration, once the beacon has been thrown into the water. Each second after 0:26 in Video 3 corresponds to a measurement point in the graph above. For the first few seconds the hydrophone array is not steady, so the measurements are off. We then hold it approximately steady along the normal line for a few seconds and start rotating clockwise at 0:35. The 0:45 mark in Video 3 corresponds to delay measurement 20 in the graph; the few misaligned pulses there can possibly be attributed to errors in cross-correlation. At 0:48 we start to rotate back counter-clockwise, so from delay measurements 25 to 35 the angle trends back toward positive. From 1:00 until 1:06 the beacon is nearly in line with our sensors, resulting in unsolvable geometry between delay measurements 35 and 40. From 1:06 until 1:15 we rotate even further so that our sensors face backwards, producing positive angles from measurements 40 to 45: the algorithm interprets signals coming from behind as if they arrived from the mirrored direction in front, which is why we capture positive angles during that period. From 1:15 to 1:20 the beacon is again nearly in line with the sensors, so we are unable to obtain solutions given the limits of our accuracy; this corresponds to delay measurements 45 to 50. We then rotate back from 1:20 to 1:30 in the video, so we see decreasing positive angles at delay measurements 55 to 60, as expected. A similar, but slightly different, trend occurs in Figure 14.

Video 3: April 16th Testing

 

Ideal Data

Our current testing setup was limited by its 48 kHz sampling rate. By creating simulated data in MATLAB, we can model the potential capabilities of our system with sampling rates that were unattainable in real life due to our hardware constraints, such as 96 kHz or 192 kHz. In Figure 16, the four red dots correspond to the hydrophones in our sensor array, and the blue dot corresponds to our beacon, which can be placed arbitrarily within the xy-plane. Based on the position of the blue dot, we generate ideal signals with the appropriate time delays for each hydrophone. The speed of sound in water is assumed to be 1420 m/s at 7 °C. We also add additive white Gaussian noise (AWGN) to simulate the noise that would have corrupted our signals in real life. Using the same algorithm with which we processed our real-life data, we generate the two plots below. They remain horizontal, which is expected since neither the sound source nor the hydrophone array is moving. Moreover, the angles between sensors 1 and 3 as well as 2 and 4 agree, as shown in Figures 17 and 18.
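A sketch of such a signal generator, written here in Python with illustrative geometry, SNR, and sample count rather than the exact Figure 16 configuration:

```python
import numpy as np

C = 1420.0        # speed of sound in water at ~7 C (m/s)
FS_SIM = 192_000  # a sampling rate beyond our hardware's 48 kHz
F_PULSE = 20_000

def simulate_channels(src, hydrophones, n_samples=2048, snr_db=20, seed=0):
    """Ideal pinger recordings with per-hydrophone delays plus AWGN.

    src and hydrophones are 2-D positions in meters; the geometry,
    SNR, and sample count are illustrative values.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / FS_SIM
    channels = []
    for h in np.atleast_2d(np.asarray(hydrophones, dtype=float)):
        # propagation delay from the source to this hydrophone
        delay = np.linalg.norm(np.asarray(src, dtype=float) - h) / C
        tone = np.sin(2 * np.pi * F_PULSE * (t - delay))
        # the tone has power 0.5, so scale the noise to hit the SNR
        noise_std = np.sqrt(0.5) * 10 ** (-snr_db / 20)
        channels.append(tone + noise_std * rng.standard_normal(n_samples))
    return np.array(channels)
```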

Screen Shot 2019-04-24 at 12.35.34 AM.pn

Figure 16:  Positioning of Sound Source

Screen Shot 2019-04-24 at 12.35.47 AM.pn
Screen Shot 2019-04-24 at 12.35.41 AM.pn

Figures 17 & 18: Resultant Angle measures for Ideal Data

References:

Al-Khazali, Hisham A. H.; Askari, Mohamad R. (May 2012). "Geometrical and Graphical Representations Analysis of Lissajous Figures in Rotor Dynamic System."

Blosser, Brian; Meshulam, Matt (May 2008). "Implementing a Microphone Array on a Mobile Robotic Device."

Ridruejo et al. (June 2016). "An Optimized Method Based on Digitalized Lissajous Curve to Determine Lifetime of Luminescent Materials on Optical Fiber Sensors." Journal of Sensors, Hindawi. www.hindawi.com/journals/js/2016/6019439/.

