Fast Fourier Transform for .wav files

My objective is to make a speaker recognition program using Delphi.

So far, I have managed to record my voice through a microphone into a .wav file. However, I am confused about how to process the .wav file with an FFT to get its frequency-domain representation. I have read several papers about the FFT, but they are too theoretical, and I find it difficult to turn them into Delphi code.

Any help will be greatly appreciated.

Comments

  • sziszi81 Member Posts: 80
    The .wav file is a data stream made up of small records (samples), each holding the amplitude of the signal at one instant. Parameters such as frequency are not stored directly; they have to be computed from runs of samples. You have to read these records one by one and analyze them. Every sound (letter) you pronounce has a pattern (a short run of these samples). When you analyze a recorded sound file you have to find these patterns in the sound stream.

    Can you obtain the parameters out of the parts of the stream?

  • nospacebar Member Posts: 7
    My problem now is how to read and analyze a recorded sound file. If it were a graphics file [.bmp or .jpg], I could read it with TImage.Canvas.Pixels to obtain each pixel and then use the pixels in further processing. Since this is a sound file, it is quite different, and I don't know any function or way to "read" the file.

    I hope my poor explanation doesn't confuse you.

    Thanks for your kind reply.



  • zibadian Member Posts: 6,349
    You could open the file using a TFileStream(). That will give you access to all the bytes of the file. Then you could write code to read the sound parameters from the file using the wav-format description.
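
    A minimal sketch of that idea, assuming the common RIFF/WAVE layout with an uncompressed PCM 'fmt ' chunk followed at some point by a 'data' chunk. The names TFourCC, TChunkHeader, TPcmFormat, IsID and ReadWavFormat are made up for this illustration; they are not part of the VCL. Compressed or extensible formats would need more handling than is shown here.

    ```pascal
    uses
      Classes, SysUtils;

    type
      TFourCC = array[0..3] of AnsiChar;

      TChunkHeader = packed record
        ID: TFourCC;        // for example 'RIFF', 'fmt ' or 'data'
        Size: LongWord;     // payload size in bytes (little-endian)
      end;

      TPcmFormat = packed record   // first 16 bytes of the 'fmt ' chunk
        FormatTag: Word;           // 1 means uncompressed PCM
        Channels: Word;
        SamplesPerSec: LongWord;
        AvgBytesPerSec: LongWord;
        BlockAlign: Word;
        BitsPerSample: Word;
      end;

    // Compares a four-character chunk ID with a four-character literal.
    function IsID(const ID: TFourCC; const S: AnsiString): Boolean;
    begin
      Result := (Length(S) = 4) and CompareMem(@ID, PAnsiChar(S), 4);
    end;

    // Reads the 'fmt ' chunk and leaves the stream positioned at the first
    // byte of the sample data. Returns False if the stream does not look
    // like a RIFF/WAVE file with both chunks present.
    function ReadWavFormat(Stream: TStream; out Fmt: TPcmFormat;
      out DataSize: LongWord): Boolean;
    var
      Chunk: TChunkHeader;
      WaveID: TFourCC;
      HaveFmt: Boolean;
    begin
      Result := False;
      HaveFmt := False;
      DataSize := 0;
      FillChar(Fmt, SizeOf(Fmt), 0);
      if Stream.Read(Chunk, SizeOf(Chunk)) <> SizeOf(Chunk) then Exit;
      if not IsID(Chunk.ID, 'RIFF') then Exit;
      if Stream.Read(WaveID, SizeOf(WaveID)) <> SizeOf(WaveID) then Exit;
      if not IsID(WaveID, 'WAVE') then Exit;
      while Stream.Read(Chunk, SizeOf(Chunk)) = SizeOf(Chunk) do
      begin
        if IsID(Chunk.ID, 'fmt ') then
        begin
          if Chunk.Size < SizeOf(Fmt) then Exit;
          if Stream.Read(Fmt, SizeOf(Fmt)) <> SizeOf(Fmt) then Exit;
          // Skip any extra format bytes plus the pad byte of an odd-sized chunk.
          Stream.Position := Stream.Position + Int64(Chunk.Size) - SizeOf(Fmt)
            + (Chunk.Size and 1);
          HaveFmt := True;
        end
        else if IsID(Chunk.ID, 'data') then
        begin
          DataSize := Chunk.Size;
          Result := HaveFmt;
          Exit;
        end
        else
          // Unknown chunk (for example 'LIST'): skip its payload and pad byte.
          Stream.Position := Stream.Position + Int64(Chunk.Size) + (Chunk.Size and 1);
      end;
    end;
    ```

    It could be called with a TFileStream created as TFileStream.Create(Filename, fmOpenRead or fmShareDenyWrite); after a successful call, the next DataSize bytes of the stream are the samples.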
  • nospacebar Member Posts: 7

    If you don't mind, could you provide me with an example of how to do it with a wav-format file? I'd really appreciate that.

    Thanks in advance.

  • zibadian Member Posts: 6,349
    I don't know the wave format, so this example is not specific to it, but it shows how to read data from a stream. I do know that a wave file has a header to identify it, followed by blocks of data. The example shows how to read both, but the header and data blocks are plain byte arrays and nothing is done with them.
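    [code]
    // Requires Classes and SysUtils in the uses clause.
    procedure ReadFileInBlocks(const Filename: string);
    var
      Stream: TStream;
      header: array[0..255] of byte;
      a: array[0..1023] of byte;
      Count: Integer;
    begin
      Stream := TFileStream.Create(Filename, fmOpenRead or fmShareDenyWrite);
      try
        // Read the fixed-size header first.
        Stream.Read(header, SizeOf(header));
        // Then read the rest block by block; the last block may be only partly filled.
        while Stream.Position < Stream.Size do
        begin
          Count := Stream.Read(a, SizeOf(a));
          if Count = 0 then
            Break;
          // Process a[0..Count - 1] here.
        end;
      finally
        Stream.Free;
      end;
    end;
    [/code]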
    You can call other procedures/functions to handle the data from the stream, like you would any other data.
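
    As an illustration of handing the data on, here is a rough sketch that turns raw bytes into numbers, assuming the stream is already positioned at 16-bit mono PCM sample data (signed, little-endian, which matches the byte order Delphi uses on x86). TSampleArray and ReadSamples16 are hypothetical names for this example only. An FFT routine would typically work on an array like the one returned, usually in frames whose length is a power of two.

    ```pascal
    uses
      Classes, SysUtils;

    type
      TSampleArray = array of Double;

    // Reads up to Count 16-bit samples from the current stream position and
    // scales each one to roughly the range -1.0 .. 1.0. The result is shorter
    // than Count if the stream ends early.
    function ReadSamples16(Stream: TStream; Count: Integer): TSampleArray;
    var
      Raw: array of SmallInt;
      BytesRead, I, N: Integer;
    begin
      Result := nil;
      if Count <= 0 then
        Exit;
      SetLength(Raw, Count);
      BytesRead := Stream.Read(Raw[0], Count * SizeOf(SmallInt));
      N := BytesRead div SizeOf(SmallInt);
      SetLength(Result, N);
      for I := 0 to N - 1 do
        Result[I] := Raw[I] / 32768.0;
    end;
    ```

    For stereo files the samples of the two channels are interleaved, so this sketch would have to pick every second value or average each pair.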
  • nospacebar Member Posts: 7

    Thanks for your quick reply :)
    Suppose I want to analyze the header or part of the stream: can I write it to a file or show it on the screen? If so, what method should I use?
  • zibadian Member Posts: 6,349
    Here's how:
    1: Create a stream object
    2: Use SetLength() to initialize a string with the length of the header
    3: Call Stream.Read() with that string:
    [code]
    Stream.Read(MyString[1], Length(MyString));
    [/code]
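    It is important to include the [1], because a Delphi string variable is really a pointer to the character data: passing MyString itself would overwrite that pointer instead of filling the characters. In Unicode versions of Delphi (2009 and later), declare MyString as AnsiString so that each character is exactly one byte.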
    4: Assign the string to the caption/text of a control, such as a TLabel or a TEdit
    5: Free the stream object again.
    This will show the header in its totality. If you want to show the individual header fields, use a record for the header and read that using SizeOf(). Then format the output as you want to show it.
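
    The steps above can be sketched roughly as follows. Because a binary header contains bytes that do not display well as text, this variant formats each byte as two hex digits instead of showing the raw string. HeaderAsHex is a made-up helper name, and the AnsiString buffer is an assumption so that one character equals one byte in every Delphi version.

    ```pascal
    uses
      Classes, SysUtils;

    // Reads up to Count bytes from the start of the stream and returns them
    // as two-digit hex values separated by single spaces.
    function HeaderAsHex(Stream: TStream; Count: Integer): string;
    var
      Buf: AnsiString;
      I, N: Integer;
    begin
      Result := '';
      if Count <= 0 then
        Exit;
      SetLength(Buf, Count);
      Stream.Position := 0;
      // Buf[1] passes the first character, not the string's internal pointer.
      N := Stream.Read(Buf[1], Count);
      for I := 1 to N do
      begin
        if I > 1 then
          Result := Result + ' ';
        Result := Result + IntToHex(Ord(Buf[I]), 2);
      end;
    end;
    ```

    On a form this might be used as Label1.Caption := HeaderAsHex(Stream, 44); where Label1 is whatever control you have and 44 is only the size of the simplest PCM wave header, not something every file is guaranteed to match.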
  • nospacebar Member Posts: 7

    Then could you tell me how to read each byte of the stream data? Let's say I'd like to know the value of the n-th byte. What method should I use?

    Thanks for your great help :)
  • zibadian Member Posts: 6,349
    TStream.Seek() and TStream.Read(). See help files for more info.
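
    A small sketch of combining the two calls, assuming a zero-based offset counted from the start of the stream. ByteAt is a hypothetical helper name; ReadBuffer is used instead of Read so that a short read raises an exception rather than passing unnoticed.

    ```pascal
    uses
      Classes, SysUtils;

    // Returns the byte stored at zero-based offset Index in the stream.
    // Raises ERangeError if Index lies outside the stream.
    function ByteAt(Stream: TStream; Index: Int64): Byte;
    begin
      if (Index < 0) or (Index >= Stream.Size) then
        raise ERangeError.CreateFmt('Offset %d is outside the stream', [Index]);
      Stream.Seek(Index, soBeginning);
      Stream.ReadBuffer(Result, SizeOf(Result));
    end;
    ```

    Seeking for every single byte is slow on a file stream, so for processing many bytes it is usually better to read a whole block into an array and index into that.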