Biometric Recognition Group - ATVS

INSTRUCTIONS FOR DOWNLOADING Ahumada_25

  1. Download the license agreement and email one signed, scanned copy to joaquin.gonzalez@uam.es (javier.hernandezo@uam.es in cc), following the instructions given in point 2.
     

  2. Send an email to joaquin.gonzalez@uam.es (javier.hernandezo@uam.es in cc), as follows:
    Subject: [DATABASE download: Ahumada_25]

    Body: Your name, e-mail address, telephone number, organization, postal address, the purpose for which you will use the database, and the time and date at which you sent the email with the signed license agreement.
     

  3. Once the email copy of the license agreement has been received at ATVS, you will receive an email with a username, a password, and a time slot to download the database.
     

  4. Download the database, providing the authentication information given in step 3. After you finish the download, please notify joaquin.gonzalez@uam.es by email that you have successfully completed the transaction.
     

  5. For more information, please contact: joaquin.gonzalez@uam.es



DESCRIPTION OF Ahumada_25

The Ahumada_25 database is a subset of the Ahumada database comprising its first 25 speakers. The speech files have the following characteristics:

  • The sampling rate is 16,000 Hz for files "M1", "M2", "M3", "M4", "M5" and "M6" (microphone sessions).

  • The sampling rate is 8,000 Hz for files "T1", "T2", "T3" and "M7" (telephone sessions).

  • All files store 16-bit binary samples.

  • Each file has a 20-byte header, which is specific to the segmentation software used and independent of the speech samples.

Files 006M6A00.wav and 022T1C10.WAV are damaged.
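
As a rough illustration of the file layout described above, the following Python sketch skips the 20-byte header and decodes the remaining bytes as 16-bit samples. The description does not state the byte order, the signedness, or the number of channels, so little-endian, signed, single-channel samples are assumptions here and should be checked against real data. The names decode_samples and read_speech_file are illustrative and are not part of the database distribution.

```python
import struct

HEADER_BYTES = 20  # header size stated in the description above

# Sampling rates stated above, keyed by the session code found in the file name.
SAMPLE_RATE_HZ = {
    "M1": 16000, "M2": 16000, "M3": 16000,
    "M4": 16000, "M5": 16000, "M6": 16000,
    "T1": 8000, "T2": 8000, "T3": 8000, "M7": 8000,
}


def decode_samples(raw, byte_order="<"):
    """Drop the header and decode the rest as signed 16-bit integers.

    ASSUMPTION: little-endian ("<") signed samples; pass ">" to try big-endian.
    """
    payload = raw[HEADER_BYTES:]
    count = len(payload) // 2  # ignore a trailing odd byte, if any
    return list(struct.unpack(f"{byte_order}{count}h", payload[: count * 2]))


def read_speech_file(path, byte_order="<"):
    """Read one speech file from disk and return its samples as a list of ints."""
    with open(path, "rb") as handle:
        return decode_samples(handle.read(), byte_order)
```

If the decoded signal sounds like noise, the byte-order assumption is the first thing to revisit. The two files flagged as damaged may not decode meaningfully either way.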



FILE NOMENCLATURE

File names follow this format:

  • The first three characters correspond to the speaker number.

  • The fourth and fifth characters indicate the session:

    • "T1".- Telephone conversation, 1st session.

    • "T2".- Telephone conversation, 2nd session.

    • "T3".- Telephone-Microphone, 3rd session.

    • "M1".- Microphone, 1st session, SONY ECM-66B lapel microphone.

    • "M2".- Microphone, 2nd session, SONY ECM-66B lapel microphone.

    • "M3".- Microphone, 3rd session, SONY ECM-66B lapel microphone.

    • "M4".- Microphone, 1st session, AKG D80S desktop microphone.

    • "M5".- Microphone, 2nd session, AKG C410-B head-mounted microphone.

    • "M6".- Microphone, 3rd session, TARGET lapel microphone.

    • "M7".- Telephone-Microphone, 3rd session, SONY ECM-66B lapel microphone.

  • The sixth character corresponds to the task:

    • "A".- Isolated numbers, common to all speakers.

    • "B".- Number strings.

    • "C".- Sentences, common to all speakers.

    • "D".- Text, common to all speakers.

    • "E".- Specific text, common to all speakers.

    • "F".- Spontaneous speech.

  • The last two characters (seventh and eighth) determine the sub-task, listed here together with the preceding task letter:

    • "A00".- Isolated number, common to all speakers. Simple task.

    • "B01 ... B10".- First to last numeric strings.

    • "C01 ... C10".- First to last sentence.

    • "D01".- Text, common to all speakers. Normal rate (sessions M and T).

    • "D02".- Text, common to all speakers. Slow rate (session M).

    • "D03".- Text, common to all speakers. Fast rate (session M).

    • "E00".- Specific text, common to all speakers. Simple task.

    • "F00".- Spontaneous speech. Simple task.
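
The naming scheme above can be decoded mechanically. The following Python sketch splits a file name into speaker number, session code, task letter and sub-task number. The function name parse_name, the AhumadaName tuple and the shortened task labels are illustrative choices, not part of the database distribution, and the sketch assumes the extension may appear in either case, as in the two damaged file names listed earlier.

```python
import re
from typing import NamedTuple

# Session codes and task letters as listed in the nomenclature above.
SESSIONS = {"T1", "T2", "T3", "M1", "M2", "M3", "M4", "M5", "M6", "M7"}
TASKS = {
    "A": "Isolated numbers",
    "B": "Number strings",
    "C": "Sentences",
    "D": "Text",
    "E": "Specific text",
    "F": "Spontaneous speech",
}


class AhumadaName(NamedTuple):
    speaker: int   # characters 1-3
    session: str   # characters 4-5
    task: str      # character 6
    subtask: int   # characters 7-8


_PATTERN = re.compile(r"^(\d{3})([A-Z]\d)([A-Z])(\d{2})$")


def parse_name(filename):
    """Split a name such as '006M6A00.wav' into its four fields."""
    stem = filename.rsplit(".", 1)[0].upper()  # drop the extension, if any
    match = _PATTERN.match(stem)
    if match is None:
        raise ValueError(f"not an Ahumada-style file name: {filename!r}")
    speaker, session, task, subtask = match.groups()
    if session not in SESSIONS or task not in TASKS:
        raise ValueError(f"unknown session or task in {filename!r}")
    return AhumadaName(int(speaker), session, task, int(subtask))
```

For example, parse_name("022T1C10.WAV") yields speaker 22, session "T1", task "C" and sub-task 10, that is, the tenth sentence of the first telephone session.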



REFERENCES

For further information on the database, we refer the reader to the following article:

  • [SC2000] Ortega García, J., González Rodríguez, J., Marrero-Aguiar, V., "AHUMADA: A large speech corpus in Spanish for speaker characterization and identification", Speech Communication, vol. 31, pp. 255-264, June 2000.

Please remember to cite article [SC2000] in any work made public, in whatever form, that is based directly or indirectly on any part of the Ahumada_25 database.