DESCRIPTION OF BiosecurID-SONOF DB
The BiosecurID-SONOF DB contains two complementary datasets:
Real dataset: containing on-line and off-line versions of the exact same signatures.
Synthetic dataset: containing off-line signatures generated according to the method described in [PR2015] based on the dynamic signatures of the real database.
Some examples of the data that can be found in the database are shown in Fig. 3.
The real dataset is a subcorpus of the signature data contained in the BiosecurID multimodal database. BiosecurID was acquired at five different Spanish universities and comprises eight different biometric traits captured in four sessions over a six-month time span [PAA2010]. The subcorpus included in the BiosecurID-SONOF Real dataset is the signature data of the 132 users acquired at the Universidad Autonoma de Madrid.
The BiosecurID-Signature UAM subcorpus comprises 132 users, with 16 genuine signatures (four per session) and 12 skilled forgeries (three per session) for every subject. Hence, the database contains the on-line and off-line data of 16x132=2,112 genuine signatures and of 12x132=1,584 skilled forgeries.
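The totals above follow directly from the corpus dimensions. As a quick sanity check, the arithmetic can be reproduced as follows (the constant names are illustrative only and do not come from the database distribution):

```python
# Corpus dimensions as stated in the description above.
USERS = 132
SESSIONS = 4
GENUINE_PER_SESSION = 4
FORGERIES_PER_SESSION = 3

genuine_per_user = SESSIONS * GENUINE_PER_SESSION      # 16
forgeries_per_user = SESSIONS * FORGERIES_PER_SESSION  # 12

total_genuine = USERS * genuine_per_user               # 2,112
total_forgeries = USERS * forgeries_per_user           # 1,584
total_signatures = total_genuine + total_forgeries     # 3,696

print(total_genuine, total_forgeries, total_signatures)
```

Each of these 3,696 signatures is available in both its on-line and its off-line version.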
Handwritten signatures were acquired with an Intuos3 A4/Inking pen tablet, with a predefined paper template placed over the digitizing device as shown in Fig. 1. Users were told to sign inside a delimited grid (25 mm x 120 mm) in order to reduce rotation and size variations. Signatures were written on the marked area with a special inking pen, which also captured the x and y trajectories and the pen pressure during signing at a sampling frequency of 100 Hz. In this way, both versions of the same samples, dynamic and static, were captured simultaneously. To obtain the final off-line digitized samples, the grid templates used to capture the static signatures were scanned at 600 dpi into grey-level PNG files, which were then processed to automatically segment the signature images. The segmented images were stored with the same codename as their on-line versions.
Consequently, the database contains the off-line (on paper) and on-line versions of the exact same real signatures. Genuine and skilled forgery real samples of the same user are shown in the first two rows of Fig. 3, where both the dynamic and static versions of the same signatures are depicted.
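When relating the scanned images to physical dimensions, the 600 dpi scanning resolution gives the conversion between millimetres and pixels. The sketch below is only an illustration of that arithmetic; the function names are hypothetical, and the segmented signature images are expected to be smaller than the full signing grid.

```python
MM_PER_INCH = 25.4
SCAN_DPI = 600  # scanning resolution stated in the text


def mm_to_pixels(mm, dpi=SCAN_DPI):
    """Convert a length in millimetres to pixels at the given resolution."""
    return mm / MM_PER_INCH * dpi


def pixels_to_mm(pixels, dpi=SCAN_DPI):
    """Convert a length in pixels to millimetres at the given resolution."""
    return pixels * MM_PER_INCH / dpi


# Size of the 25 mm x 120 mm signing grid in scanned pixels.
grid_px = (mm_to_pixels(25), mm_to_pixels(120))
print([round(v) for v in grid_px])
```

At 600 dpi the signing grid therefore spans roughly 591 by 2835 pixels, which bounds the size of any segmented signature image.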
Fig. 1: Diagram of the BiosecurID DB acquisition process. Users signed on a paper template that limited the scaling and rotation variability, placed over a digitizing tablet. This way, the on-line and off-line versions of the same signature were acquired simultaneously (figure extracted from [PR2015]).
The synthetic off-line data was generated taking as input the on-line real signatures of the BiosecurID-SONOF Real Dataset. That is, for each real on-line signature in the BiosecurID-SONOF Real Dataset (genuine or skilled forgery), its off-line synthetic version is produced following the methodology described in [PR2015]. Therefore, the synthetic off-line dataset presents exactly the same structure as the real version, that is: 4 sessions, 132 users, 4 genuine signatures and 3 skilled forgeries per session and user.
As described in [PR2015] and shown in Fig. 2, for each real on-line signature, two different synthetic images are produced: Ienhanced and Ipen-ups.
The last row of Fig. 3 shows the synthetic static samples corresponding to the three real signatures depicted in the first two rows. Each synthetic signature is defined by two different images: Ienhanced (third row, top), which incorporates pressure and speed information from the real dynamic signature; and Ipen-ups (third row, bottom), obtained from the signature trajectory during pen-ups.
Fig. 2: Diagram of the enhanced off-line signature generation approach described in [PR2015].
Fig. 3: Real on-line, real off-line and synthetic off-line versions of typical signature examples that can be found in the BiosecurID-SONOF DB. Two genuine samples (first two columns) and a skilled forgery (last column) of the same user are shown. On-line samples are depicted with their corresponding time functions (x and y trajectories and pressure function p). Synthetic samples were generated following the method described in [PR2015]. Each synthetic signature is defined by two images: Ienhanced (third row, top) and Ipen-ups (third row, bottom).
FILE FORMAT
On-line signatures are stored in MATLAB .mat files containing three vectors [x, y, p], one for each of the time functions that define a signature (i.e., horizontal coordinate, vertical coordinate and pressure signal).
Off-line signatures (both real and synthetic) are stored as regular 600 dpi grey-level .png image files.
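For users working outside MATLAB, the on-line files can be read with SciPy. The following is a minimal sketch under stated assumptions: it assumes the variables inside each .mat file are named x, y and p (the description lists the three vectors but does not confirm the stored variable names), and the function names are hypothetical. The time axis is derived from the 100 Hz sampling frequency of the acquisition.

```python
def load_online_signature(path):
    """Return the (x, y, p) time functions stored in one on-line .mat file.

    Assumption: the variables inside the file are named 'x', 'y' and 'p'.
    """
    from scipy.io import loadmat  # third-party dependency, imported lazily

    # squeeze_me=True drops unit dimensions, so 1xN arrays become 1-D vectors.
    data = loadmat(path, squeeze_me=True)
    return data["x"], data["y"], data["p"]


def time_axis(n_samples, fs_hz=100.0):
    """Time stamps in seconds for n_samples points sampled at fs_hz."""
    return [i / fs_hz for i in range(n_samples)]
```

If the stored variable names differ, listing the keys of the dictionary returned by loadmat shows what each file actually contains.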
NOMENCLATURE
There is a slight difference between the naming applied to on-line and off-line files.
The nomenclature followed to name the on-line signature files is as follows: uXXXX_sYYYY_sgZZZZ
XXXX: is the number of the user [1001, 1002 ... 1132]
YYYY: is the number of the session [0001, 0002, 0003, 0004]
ZZZZ: is the number of the sample [0001, 0002, 0003, 0004, 0005, 0006, 0007]
Signatures [1, 2, 6, 7] of each session are genuine samples.
Signatures [3, 4, 5] of each session are skilled forgeries.
The nomenclature followed to name the off-line signature files is as follows: uXXXX_sYYYY_sgZZZZA
XXXX: is the number of the user [1001, 1002 ... 1132]
YYYY: is the number of the session [0001, 0002, 0003, 0004]
ZZZZ: is the number of the sample [0001, 0002, 0003, 0004, 0005, 0006, 0007]
A: can take the values "g" for genuine signatures (samples [1, 2, 6, 7]) and "f" for skilled forgeries (samples [3, 4, 5]).
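Both naming patterns listed above, the one without a suffix and the one with a trailing g/f letter, can be parsed with a single regular expression. The sketch below is illustrative only: the function and constant names are hypothetical, and it expects the file name without its extension.

```python
import re

# u<user>_s<session>_sg<sample>, optionally followed by 'g' or 'f'.
NAME_RE = re.compile(r"u(\d{4})_s(\d{4})_sg(\d{4})([gf]?)")

GENUINE_SAMPLES = {1, 2, 6, 7}
FORGERY_SAMPLES = {3, 4, 5}


def parse_signature_name(stem):
    """Split a file name (without extension) into its fields."""
    match = NAME_RE.fullmatch(stem)
    if match is None:
        raise ValueError(f"not a BiosecurID-SONOF name: {stem!r}")
    user, session, sample = (int(g) for g in match.groups()[:3])
    suffix = match.group(4)
    if not (1001 <= user <= 1132 and 1 <= session <= 4 and 1 <= sample <= 7):
        raise ValueError(f"field out of range in {stem!r}")
    # The sample number alone determines genuine vs. skilled forgery.
    label = "g" if sample in GENUINE_SAMPLES else "f"
    if suffix and suffix != label:
        raise ValueError(f"suffix {suffix!r} contradicts sample number in {stem!r}")
    return {
        "user": user,
        "session": session,
        "sample": sample,
        "genuine": label == "g",
    }


print(parse_signature_name("u1001_s0001_sg0003f"))
```

Because the sample number already encodes whether a signature is genuine, the parser can also check that the g/f suffix, when present, agrees with it.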
REFERENCES
For further information on the database we refer the reader to the following articles, all of which are publicly available in the publications section of the ATVS group webpage:
[PR2015] J. Galbally, M. Diaz-Cabrera, M. A. Ferrer, M. Gomez-Barrero, A. Morales and J. Fierrez, "On-Line Signature Recognition Through the Combination of Real Dynamic Data and Synthetically Generated Static Data", Pattern Recognition, Vol. 48, pp. 2921-2934, September 2015.
[PAA2010] J. Fierrez, J. Galbally, J. Ortega-Garcia, M. R. Freire, F. Alonso-Fernandez, D. Ramos, D. T. Toledano, J. Gonzalez-Rodriguez, J. A. Siguenza, J. Garrido-Salas, E. Anguiano, G. Gonzalez-de-Rivera, R. Ribalda, M. Faundez-Zanuy, J. A. Ortega, V. Cardeñoso-Payo, A. Viloria, C. E. Vivaracho, Q. I. Moro, J. J. Igarza, J. Sanchez, I. Hernaez, C. Orrite-Uruñuela, F. Martinez-Contreras and J. J. Gracia-Roche, "BiosecurID: A Multimodal Biometric Database", Pattern Analysis and Applications, Vol. 13, n. 2, pp. 235-246, 2010.
Please remember to reference article [PR2015] on any work made public, whatever the form, based directly or indirectly on any part of the BiosecurID-SONOF DB.