Přehled o publikaci
2018
Modifying Hamming Spaces for Efficient Search
MÍČ, Vladimír; David NOVÁK a Pavel ZEZULAZákladní údaje
Originální název
Modifying Hamming Spaces for Efficient Search
Autoři
MÍČ, Vladimír; David NOVÁK a Pavel ZEZULA
Vydání
USA, 18th International Conference on Data Mining Workshops (ICDMW), Singapore, November 17-21, 2018, od s. 945-953, 9 s. 2018
Nakladatel
IEEE
Další údaje
Jazyk
angličtina
Typ výsledku
Stať ve sborníku
Stát vydavatele
Spojené státy
Utajení
není předmětem státního či obchodního tajemství
Forma vydání
tištěná verze "print"
Označené pro přenos do RIV
Ano
Kód RIV
RIV/00216224:14330/18:00103668
Organizace
Fakulta informatiky – Masarykova univerzita – Repozitář
ISBN
978-1-5386-9288-2
ISSN
UT WoS
EID Scopus
Klíčová slova anglicky
Similarity search;Hamming space;Hamming Weight Tree; Lower bound inthe Hamming space
Návaznosti
EF16_019/0000822, projekt VaV.
Změněno: 6. 9. 2020 07:04, RNDr. Daniel Jakubík
Anotace
V originále
We focus on the efficient search for the most similar bit strings to a given query in the Hamming space. The distance of this space can be lower-bounded by a function based on a difference of the number of ones in the compared strings, i.e. their weights. Recently, such property has been successfully used by the Hamming Weight Tree (HWT) indexing structure. We propose modifications of the bit strings that preserve pairwise Hamming distances but improve the tightness of these lower bounds, so the query evaluation with the HWT is several times faster. We also show that the unbalanced bit strings, recently reported to provide similar quality of search as the traditionally used balanced bit strings, are more easy to index with the HWT. Combined with the distance preserving modifications, the HWT query evaluation can be more than one order of magnitude faster than the HWT baseline. Finally, we show that such modifications are useful even for a very complex data where the search with the HWT is slower than a sequential search.