Subwords

A subword of a word \(w\) is a word obtained by deleting the letters at some (non necessarily adjacent) positions in \(w\). It is not to be confused with the notion of factor where one keeps adjacent positions in \(w\). Sometimes it is useful to allow repeated uses of the same letter of \(w\) in a “generalized” subword. We call this a subword with repetitions.

For example:

  • “bnjr” is a subword of the word “bonjour” but not a factor;
  • “njo” is both a factor and a subword of the word “bonjour”;
  • “nr” is a subword of “bonjour”;
  • “rn” is not a subword of “bonjour”;
  • “nnu” is not a subword of “bonjour”;
  • “nnu” is a subword with repetitions of “bonjour”;

A word can be given either as a string, as a list or as a tuple.

As repetition can occur in the initial word, the subwords of a given words is not a set in general but an enumerated multiset!

Todo

  • implement subwords with repetitions
  • implement the category of EnumeratedMultiset and inheritate from when needed (i.e. the initial word has repeated letters)

AUTHORS:

  • Mike Hansen: initial version
  • Florent Hivert (2009/02/06): doc improvements + new methods + bug fixes
sage.combinat.subword.Subwords(w, k=None, element_constructor=None)

Return the set of subwords of w.

INPUT:

  • w – a word (can be a list, a string, a tuple or a word)
  • k – an optional integer to specify the length of subwords
  • element_constructor – an optional function that will be used to build the subwords

EXAMPLES:

sage: S = Subwords(['a','b','c']); S
Subwords of ['a', 'b', 'c']
sage: S.first()
[]
sage: S.last()
['a', 'b', 'c']
sage: S.list()
[[], ['a'], ['b'], ['c'], ['a', 'b'], ['a', 'c'], ['b', 'c'], ['a', 'b', 'c']]

The same example using string, tuple or a word:

sage: S = Subwords('abc'); S
Subwords of 'abc'
sage: S.list()
['', 'a', 'b', 'c', 'ab', 'ac', 'bc', 'abc']

sage: S = Subwords((1,2,3)); S
Subwords of (1, 2, 3)
sage: S.list()
[(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]

sage: w = Word([1,2,3])
sage: S = Subwords(w); S
Subwords of word: 123
sage: S.list()
[word: , word: 1, word: 2, word: 3, word: 12, word: 13, word: 23, word: 123]

Using word with specified length:

sage: S = Subwords(['a','b','c'], 2); S
Subwords of ['a', 'b', 'c'] of length 2
sage: S.list()
[['a', 'b'], ['a', 'c'], ['b', 'c']]

An example that uses the element_constructor argument:

sage: p = Permutation([3,2,1])
sage: Subwords(p, element_constructor=tuple).list()
[(), (3,), (2,), (1,), (3, 2), (3, 1), (2, 1), (3, 2, 1)]
sage: Subwords(p, 2, element_constructor=tuple).list()
[(3, 2), (3, 1), (2, 1)]
class sage.combinat.subword.Subwords_w(w, element_constructor)

Bases: sage.structure.parent.Parent

Subwords of a given word.

cardinality()

EXAMPLES:

sage: Subwords([1,2,3]).cardinality()
8
first()

EXAMPLES:

sage: Subwords([1,2,3]).first()
[]
sage: Subwords((1,2,3)).first()
()
sage: Subwords('123').first()
''
last()

EXAMPLES:

sage: Subwords([1,2,3]).last()
[1, 2, 3]
sage: Subwords((1,2,3)).last()
(1, 2, 3)
sage: Subwords('123').last()
'123'
random_element()

Return a random subword with uniform law.

EXAMPLES:

sage: S1 = Subwords([1,2,3,2,1,3])
sage: S2 = Subwords([4,6,6,6,7,4,5,5])
sage: for i in xrange(100):
....:   w = S1.random_element()
....:   if w in S2:
....:       assert(w == [])
sage: for i in xrange(100):
....:   w = S2.random_element()
....:   if w in S1:
....:       assert(w == [])
class sage.combinat.subword.Subwords_wk(w, k, element_constructor)

Bases: sage.combinat.subword.Subwords_w

Subwords with fixed length of a given word.

cardinality()

Returns the number of subwords of w of length k.

EXAMPLES:

sage: Subwords([1,2,3], 2).cardinality()
3
first()

EXAMPLES:

sage: Subwords([1,2,3],2).first()
[1, 2]
sage: Subwords([1,2,3],0).first()
[]
sage: Subwords((1,2,3),2).first()
(1, 2)
sage: Subwords((1,2,3),0).first()
()
sage: Subwords('123',2).first()
'12'
sage: Subwords('123',0).first()
''
last()

EXAMPLES:

sage: Subwords([1,2,3],2).last()
[2, 3]
sage: Subwords([1,2,3],0).last()
[]
sage: Subwords((1,2,3),2).last()
(2, 3)
sage: Subwords((1,2,3),0).last()
()
sage: Subwords('123',2).last()
'23'
sage: Subwords('123',0).last()
''

TESTS:

sage: Subwords('123', 0).last()  # trac 10534
''
random_element()

Return a random subword of given length with uniform law.

EXAMPLES:

sage: S1 = Subwords([1,2,3,2,1],3)
sage: S2 = Subwords([4,4,5,5,4,5,4,4],3)
sage: for i in xrange(100):
....:   w = S1.random_element()
....:   if w in S2:
....:       assert(w == [])
sage: for i in xrange(100):
....:   w = S2.random_element()
....:   if w in S1:
....:       assert(w == [])
sage.combinat.subword.smallest_positions(word, subword, pos=0)

Return the smallest positions for which subword appears as a subword of word. If pos is specified, then it returns the positions of the first appearance of subword starting at pos.

If subword is not found in word, then return False.

EXAMPLES:

sage: sage.combinat.subword.smallest_positions([1,2,3,4], [2,4])
[1, 3]
sage: sage.combinat.subword.smallest_positions([1,2,3,4,4], [2,4])
[1, 3]
sage: sage.combinat.subword.smallest_positions([1,2,3,3,4,4], [3,4])
[2, 4]
sage: sage.combinat.subword.smallest_positions([1,2,3,3,4,4], [3,4],2)
[2, 4]
sage: sage.combinat.subword.smallest_positions([1,2,3,3,4,4], [3,4],3)
[3, 4]
sage: sage.combinat.subword.smallest_positions([1,2,3,4], [2,3])
[1, 2]
sage: sage.combinat.subword.smallest_positions([1,2,3,4], [5,5])
False
sage: sage.combinat.subword.smallest_positions([1,3,3,4,5],[3,5])
[1, 4]
sage: sage.combinat.subword.smallest_positions([1,3,3,5,4,5,3,5],[3,5,3])
[1, 3, 6]
sage: sage.combinat.subword.smallest_positions([1,3,3,5,4,5,3,5],[3,5,3],2)
[2, 3, 6]
sage: sage.combinat.subword.smallest_positions([1,2,3,4,3,4,4],[2,3,3,1])
False
sage: sage.combinat.subword.smallest_positions([1,3,3,5,4,5,3,5],[3,5,3],3)
False

TEST:

We check for trac ticket #5534:

sage: w = ["a", "b", "c", "d"]; ww = ["b", "d"]
sage: x = sage.combinat.subword.smallest_positions(w, ww); ww
['b', 'd']

Previous topic

Word paths

Next topic

Lyndon words

This Page