Finding patterns in strings using suffix arrays
Stehouwer, H., & Van Zaanen, M.
Finding patterns in strings using suffix arrays. In M. Ganzha, & M. Paprzycki (Eds.), Proceedings of the International Multiconference on Computer Science and Information Technology, October 18–20, 2010, Wisła, Poland (pp. 505-511). IEEE.
Finding regularities in large data sets requires systems that are efficient in both time and space.
Here, we describe a newly developed system that
exploits the internal structure of the enhanced suffix array to find
significant patterns in a large collection of sequences. The system
searches exhaustively for all significantly compressing patterns,
where patterns may consist of symbols and skips or wildcards.
We demonstrate a possible application of the system by detecting
interesting patterns in a Dutch and an English corpus.
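
As an illustration of the underlying data structure only, the following minimal sketch builds a suffix array with a longest-common-prefix (LCP) table and reads repeated substrings off adjacent suffixes. It is not the system described here: the function names are hypothetical, the construction is naive rather than efficient, and it handles neither skips or wildcards nor any test of whether a pattern compresses significantly.

```python
def build_suffix_array(text):
    """Return suffix start positions sorted by the suffix they begin.

    Naive construction for illustration; efficient builders exist but are
    longer. This is an assumption of the sketch, not the paper's method.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp(text, sa):
    """lcp[r] = length of the common prefix of suffixes sa[r-1] and sa[r]."""
    lcp = [0] * len(sa)
    for r in range(1, len(sa)):
        a, b = sa[r - 1], sa[r]
        k = 0
        while a + k < len(text) and b + k < len(text) and text[a + k] == text[b + k]:
            k += 1
        lcp[r] = k
    return lcp


def repeated_substrings(text, min_len=2):
    """Collect the longest prefix shared by each pair of adjacent suffixes.

    Any such prefix of length >= min_len occurs at least twice in text.
    Shorter repeats that are prefixes of a reported one are not listed.
    """
    sa = build_suffix_array(text)
    lcp = build_lcp(text, sa)
    return {text[sa[r]:sa[r] + lcp[r]] for r in range(1, len(sa)) if lcp[r] >= min_len}


if __name__ == "__main__":
    print(build_suffix_array("banana"))
    print(sorted(repeated_substrings("banana")))
```

Because sorting places suffixes with a shared prefix next to each other, every repeated substring shows up as a common prefix of neighbouring entries, which is the property a pattern search over the array can exploit.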