Library reference

Base classes

This module defines the base classes: Symbol, Alphabet and Sequence, their attributes and methods.

A symbolic sequence is a list of symbols taken from a finite alphabet of length \(k\).

Internally, symbols are encoded as integers from \(0\) to \(k-1\), called ivals (for Integer VALueS), and all computations use this representation. For human readability, each symbol also has a string value (its sval), which is associated with the integer representation when needed.

This module also provides two specific alphabets:

>>> boolean_alphabet
Alphabet(Symbol(0 | False), Symbol(1 | True))
>>> binary_alphabet
Alphabet(Symbol(0 | 0), Symbol(1 | 1))
class scyseq.sequence.Symbol(value)

A symbol (or state) is used to define the state of the system at time \(t\).

__init__(value)

Initialize an instance of Symbol.

Parameters:

value (int or str) – Value associated with the sval property. Must be an integer or a string. It is automatically converted to a string.

Examples:

>>> Symbol(1) # 1 is converted to a string
Symbol(- | 1)
>>> Symbol('One')
Symbol(- | One)

It has two properties: a string value named sval and an integer value named ival which is only attributed once the symbol is inserted in an alphabet.

>>> my_symbol = Symbol('me')
>>> my_symbol.sval
'me'
>>> my_symbol.ival # returns None
property sval

The “string value” of the Symbol (i.e. its “name”). It can be read or set but not deleted; the deleter raises an exception so that the failure is explicit.

If the symbol has been inserted in an alphabet, the new sval must not already exist in that alphabet.

property ival

The “integer value” of the Symbol. It can be read but neither changed nor deleted; the setter and deleter raise an exception so that the failure is explicit.

__eq__(other)

Returns True if self.ival == other.ival and self.sval == other.sval

class scyseq.sequence.Alphabet(symbols)

The set of symbols that can be visited in a Sequence.

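An Alphabet behaves like a bidirectional dictionary between ivals and svals, with restrictions that keep this mapping consistent (for example, items cannot be assigned directly and svals must be unique).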

__init__(symbols)

Initialize an instance of the Alphabet class

Parameters:

symbols (int or list or tuple) – The objects used to build the alphabet

Alphabets can be created using:

  1. the length (an integer):

>>> Alphabet(3)
Alphabet(Symbol(0 | 0), Symbol(1 | 1), Symbol(2 | 2))
  2. a list or tuple of strings:

>>> Alphabet(['a', 'b', 'c'])
Alphabet(Symbol(0 | a), Symbol(1 | b), Symbol(2 | c))
  3. a list or tuple of symbols:

>>> Alphabet([Symbol('s0'), Symbol('s1'), Symbol('s2')])
Alphabet(Symbol(0 | s0), Symbol(1 | s1), Symbol(2 | s2))

Alphabets can be created with a list of states:

>>> state1 = Symbol('One')
>>> state2 = Symbol('Two')
>>> state3 = Symbol('Three')
>>> alpha = Alphabet([state1, state2, state3])
>>> alpha
Alphabet(Symbol(0 | One), Symbol(1 | Two), Symbol(2 | Three))
>>> print(alpha)
((0 | One), (1 | Two), (2 | Three))
>>> len(alpha)
3
>>> alpha[0]
Symbol(0 | One)

Symbols cannot be changed directly:

>>> alpha[1] = Symbol('Deux')
Traceback (most recent call last):
...
scyseq.exceptions.AlphabetAccessError: 'Alphabet' object does not support item assignment

But their sval can:

>>> alpha[1].sval = 'Deux'
>>> alpha
Alphabet(Symbol(0 | One), Symbol(1 | Deux), Symbol(2 | Three))

The svals of an alphabet can also be changed with the rename method, which takes a dictionary mapping ivals to new svals.

>>> alpha.rename({0 : 'Uno', 2 : 'Tre'})
>>> alpha
Alphabet(Symbol(0 | Uno), Symbol(1 | Deux), Symbol(2 | Tre))
__eq__(other)

Two alphabets are equal if they have the same length and if their ivals and svals coincide.

>>> alpha_a = Alphabet(['a','b','c'])
>>> alpha_b = Alphabet(['a','b','c'])
>>> alpha_c = Alphabet(3)
>>> alpha_a == alpha_b
True
>>> alpha_a == alpha_c
False
>>> alpha_c.rename({0 : 'a', 1 : 'b', 2 : 'c'})
>>> alpha_a == alpha_c
True
property svals

The tuple of string values in the alphabet

>>> alpha_a = Alphabet(['a','b','c'])
>>> alpha_a.svals
('a', 'b', 'c')
property ivals

The tuple of integer values in the alphabet

>>> alpha_a = Alphabet(['a','b','c'])
>>> alpha_a.ivals
(0, 1, 2)
items()

Returns the (ival, sval) pair for each symbol.

>>> alpha_a = Alphabet(['a','b','c'])
>>> list(alpha_a.items())
[(0, 'a'), (1, 'b'), (2, 'c')]
>>> dict(alpha_a.items())
{0: 'a', 1: 'b', 2: 'c'}
rename(replacement)

Rename the svals of the alphabet according to a dictionary with integers as keys and strings as values.

Parameters:

replacement (dictionary) – the new correspondence between ivals and svals.

See also:

The implementation in the operations module: rename()

scyseq.sequence.boolean_alphabet instance of Alphabet

A predefined Alphabet with the two symbols (0 | False) and (1 | True) representing boolean values.

scyseq.sequence.binary_alphabet instance of Alphabet

A predefined Alphabet with the two symbols (0 | 0) and (1 | 1) representing binary values.

class scyseq.sequence.Sequence(symbols, alphabet, check=True)

Defines a symbolic sequence coded using integers in \(\{0, \ldots, k-1\}\), together with its methods.

__init__(symbols, alphabet, check=True)

Initializes a Sequence object.

Parameters:
  • symbols (an object that can be coerced into an np.array of integers.) – the sequence of symbols.

  • alphabet – the alphabet, given either as an Alphabet or as the alphabet length

  • check (boolean) – should the validity of the construction be checked

or

Parameters:

s – a sequence object

Raises:

TypeError: when parameter a is not given and s is not a sequence

ValueError: when a is neither a dict nor an int

ValueError: when s contains negative values

AlphabetError: when s contains values greater than or equal to k

DictionaryError: if keys of d are not in \(\{0, \ldots, k-1\}\)

Returns:

a Sequence object with attribute s, k and d

>>> seqA = Sequence([1, 0, 0, 2, 0, 0, 0, 2, 2, 0], 3)
>>> seqA
Sequence: [1 0 0 2 0 0 0 2 2 0]
Alphabet(Symbol(0 | 0), Symbol(1 | 1), Symbol(2 | 2))
N = 10 ; k = 3
property alphabet

The Alphabet of the sequence.

property k

The length of the alphabet

property ivals

The integer values of the sequence (an array; the examples in the operations module call ivals.tolist()).

property svals

The tuple of string values

iteritems()

Returns the (ival, sval) pairs.

iterivals()

Returns the iterator over the integer values

itersvals()

Returns the iterator over the string values

__eq__(other)

Return self==value.

rename(replacement)

Rename the svals of a sequence (in fact the svals of the alphabet)

See the implementation in the operations module: rename()

roll(step)

Roll the sequence by step positions (with periodic boundary conditions).

See the implementation in the operations module: roll()

reverse()

Reverse the sequence.

See the implementation in the operations module: reverse()

shuffle()

Shuffle the sequence.

See the implementation in the operations module: shuffle()

reduce()

Reduce the sequence, i.e. collapse each run of repeated symbols into a single symbol.

See the implementation in the operations module: reduce()

count(value=None)

Count the occurrences of value, or of every symbol if value is None.

See the implementation in the operations module: count()

frequency(value=None)

Returns the frequency of value

See the implementation in the operations module: frequency()

Operations

Definitions of operations on Sequence and Alphabet objects.

The objects’ methods are wrappers around these operations.

scyseq.operations.rename(obj, replacement)

To rename symbols in place in an alphabet or a sequence, pass a dictionary with integers (ivals) as keys and strings (new svals) as values, so that the correspondence between ivals and svals is explicit.

Parameters:

replacement (dict) – The dictionary which describes the replacement.

>>> alpha_d = Alphabet(['a','b','c', 'd'])
>>> alpha_d
Alphabet(Symbol(0 | a), Symbol(1 | b), Symbol(2 | c), Symbol(3 | d))
>>> alpha_d.rename({1: 'One', 3: 'Three'})
>>> alpha_d
Alphabet(Symbol(0 | a), Symbol(1 | One), Symbol(2 | c), Symbol(3 | Three))

The replacement must be a valid candidate; otherwise an exception is raised:

>>> alpha_d.rename(['bad1', 'bad2', 'bad3']) # not a dictionary
Traceback (most recent call last):
...
TypeError: The input must be a dictionary
>>> alpha_d.rename({'bad1': 'bad2', 'bad3': 'bad4'}) # not {int : str}
Traceback (most recent call last):
...
scyseq.exceptions.AlphabetAccessError: Replacements should be {integer : string, ...}
>>> alpha_d.rename({0: 'bad0', 1: 'bad0'}) # values are not different
Traceback (most recent call last):
...
scyseq.exceptions.AlphabetAccessError: Replacement values should all be different.
>>> alpha_d.rename({0: 'Three'}) # value already exists
Traceback (most recent call last):
...
scyseq.exceptions.AlphabetAccessError: Symbol 'Three' already exists in alphabet
scyseq.operations.roll(obj, step)

Roll the sequence

>>> seq = Sequence([0, 1, 2, 0, 1], 3)
>>> roll(seq, 2).ivals.tolist()
[0, 1, 0, 1, 2]
scyseq.operations.reverse(obj)

Reverse the sequence

>>> seq = Sequence([0, 1, 2, 0, 1], 3)
>>> reverse(seq).ivals.tolist()
[1, 0, 2, 1, 0]
scyseq.operations.shuffle(obj)

Shuffle the sequence

>>> seq = Sequence([0, 1, 2, 0, 1], 3)
>>> shuffled = shuffle(seq)
>>> sorted(shuffled.ivals.tolist())
[0, 0, 1, 1, 2]
>>> seq.ivals.tolist()
[0, 1, 2, 0, 1]
scyseq.operations.reduce(obj)

Returns a reduced sequence, i.e. one in which each run of repeated symbols is replaced by a single symbol.

>>> seq = Sequence([0, 0, 2, 2, 2, 0, 1, 1], 3)
>>> reduce(seq).ivals.tolist()
[0, 2, 0, 1]
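
The reduction shown above can be sketched in plain Python on a list of ivals. This is only an illustration of the idea, not the library's implementation; `reduce_runs` is a hypothetical helper name.

```python
from itertools import groupby


def reduce_runs(ivals):
    """Keep one symbol per run of consecutive equal symbols (hypothetical helper)."""
    # groupby yields one (key, group) pair per run of equal neighbours
    return [symbol for symbol, _run in groupby(ivals)]


print(reduce_runs([0, 0, 2, 2, 2, 0, 1, 1]))  # [0, 2, 0, 1]
```
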
scyseq.operations.count(obj, value=None)

Counts the occurrences of each symbol in \(\{0, \ldots, k-1\}\) if value is None, or the occurrences of the symbol given by value.

Parameters:
  • obj (Sequence) – The sequence object to count symbols in.

  • value (int or str, optional) – The specific symbol value to count. If None, counts all symbols.

Returns:

An array of counts for each symbol, or a single integer count if value is provided.

Return type:

numpy.ndarray or int

scyseq.operations.frequency(obj, value=None)

Returns the probability of each symbol in \(\{0, \ldots, k-1\}\).

Parameters:
  • obj (Sequence) – The sequence object.

  • value (int or str, optional) – The specific symbol value to find the probability of. If None, computes probabilities for all symbols.

Returns:

An array of floats representing probabilities, or a single float if value is provided.

Return type:

numpy.ndarray or float
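
As a rough illustration of what count and frequency compute when value is None, the sketch below works on a plain list of ivals with the standard library. The helper names `count_symbols` and `symbol_frequencies` are hypothetical, and the sketch assumes a non-empty sequence; the library itself returns NumPy arrays.

```python
from collections import Counter


def count_symbols(ivals, k):
    """Occurrences of each ival in 0..k-1, including symbols that never appear."""
    counts = Counter(ivals)
    return [counts.get(i, 0) for i in range(k)]


def symbol_frequencies(ivals, k):
    """Relative frequency of each ival in 0..k-1 (assumes a non-empty sequence)."""
    n = len(ivals)
    return [c / n for c in count_symbols(ivals, k)]


ivals = [1, 0, 0, 2, 0, 0, 0, 2, 2, 0]
print(count_symbols(ivals, 3))       # [6, 1, 3]
print(symbol_frequencies(ivals, 3))  # [0.6, 0.1, 0.3]
```
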

scyseq.operations.transform(seq, correspondance, new_alphabet=None)

Transforms the initial sequence according to the correspondence iterable.

Parameters:
  • seq (Sequence) – The sequence to transform.

  • correspondance (iterable) – A list or array whose entry at index i is the new ival assigned to the current ival i.

  • new_alphabet (Alphabet, optional) – A new Alphabet object to use for the transformed sequence.

Returns:

The transformed Sequence.

Return type:

Sequence

Example

>>> seq = Sequence([0, 2, 0, 1], 3)
>>> transform(seq, [1, 0, 0]).ivals.tolist()
[1, 0, 1, 0]
>>> alphabet = Alphabet(['low', 'high'])
>>> transformed = transform(seq, [1, 0, 0], alphabet)
>>> transformed.alphabet.svals
('low', 'high')
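
The mapping in the example above amounts to a table lookup: each ival i is replaced by the entry at index i of the correspondence. A minimal sketch on plain lists, with the hypothetical helper name `transform_ivals`:

```python
def transform_ivals(ivals, correspondence):
    """Replace each ival i by correspondence[i] (hypothetical helper)."""
    return [correspondence[i] for i in ivals]


# 0 -> 1, 1 -> 0, 2 -> 0, as in the example above
print(transform_ivals([0, 2, 0, 1], [1, 0, 0]))  # [1, 0, 1, 0]
```
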
scyseq.operations.recode(lseq, new_alphabet=False, sep='+', names=None)

Recodes a list of sequences with (possibly) different alphabets but the same length into a single sequence. Passing Sequences of different lengths is an error. A new dictionary is built for the new sequence.

Parameters:
  • lseq (list) – A list of Sequence objects.

  • new_alphabet (bool, optional) – Whether to generate a new alphabet instead of integers, defaults to False.

  • sep (str, optional) – Separator to use if new_alphabet is True, defaults to ‘+’.

  • names (list, optional) – Optional names for the new alphabets.

Raises:

LengthError – When the Sequences have different lengths.

Returns:

A newly recoded Sequence object.

Return type:

Sequence

Example

>>> seq_a = Sequence([0, 0, 1, 1], 2)
>>> seq_b = Sequence([0, 1, 0, 1], 2)
>>> recoded = recode([seq_a, seq_b])
>>> recoded.ivals.tolist()
[0, 1, 2, 3]
>>> recoded.k
4
>>> named = recode([seq_a, seq_b], new_alphabet=True, names=['x', 'y'])
>>> named.alphabet.svals
('x_0+y_0', 'x_0+y_1', 'x_1+y_0', 'x_1+y_1')
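
One way to obtain the integer codes of the example above is to read each tuple of aligned ivals as a mixed-radix number. The sketch below assumes this convention because it reproduces the documented output; it is not taken from the library's source, and `recode_ivals` is a hypothetical helper that raises a plain ValueError instead of LengthError.

```python
def recode_ivals(sequences, sizes):
    """Combine aligned ival lists into one list of joint codes.

    sequences: list of equal-length lists of ivals
    sizes: alphabet length of each sequence, in the same order
    """
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all sequences must have the same length")
    recoded = []
    for symbols in zip(*sequences):
        code = 0
        for symbol, size in zip(symbols, sizes):
            code = code * size + symbol  # mixed-radix accumulation
        recoded.append(code)
    return recoded


print(recode_ivals([[0, 0, 1, 1], [0, 1, 0, 1]], [2, 2]))  # [0, 1, 2, 3]
```
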
scyseq.operations.words(seq, wlen, new_alphabet=False)

Returns a sequence encoded according to the overlapping words of length wlen in seq.

>>> seq = Sequence([0, 0, 1, 1, 0], 2)
>>> word_seq = words(seq, 2)
>>> word_seq.ivals.tolist()
[0, 1, 3, 2]
>>> word_seq.k
4
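
The word encoding of the example above can be sketched as follows: slide a window of length wlen over the ivals and read each window as a base-k number. This convention is an assumption chosen because it matches the documented output; `word_ivals` is a hypothetical helper.

```python
def word_ivals(ivals, wlen, k):
    """Encode each overlapping word of length wlen as a base-k integer."""
    words = []
    for start in range(len(ivals) - wlen + 1):
        code = 0
        for symbol in ivals[start:start + wlen]:
            code = code * k + symbol
        words.append(code)
    return words


# Words (0,0), (0,1), (1,1), (1,0) over a binary alphabet
print(word_ivals([0, 0, 1, 1, 0], 2, 2))  # [0, 1, 3, 2]
```

The resulting alphabet has at most k ** wlen symbols, which is 4 in this example.
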

Discretisation and partition

scyseq.discretize.symbolize(arr, bins, d=None)

Convert an array of continuous values into a symbolic sequence using bins.

Parameters:
  • arr (numpy.ndarray) – Array of continuous values.

  • bins (array_like) – Array of bin edges.

  • d (dict, optional) – Optional dictionary (param kept for compatibility).

Returns:

A generated symbolic sequence based on the bins.

Return type:

Sequence

scyseq.discretize.partition(arr, method='histogram', nbin=10, d=None)

Discretize a continuous series according to method.

Methods are described in Hlavackova-Schindler et al. Physics Reports 441 (2007) 1–46 pages 14–19

method = ‘histogram’

simple histogram method with equidistant binning

method = ‘marginal_equiquantization’

marginal equiquantization, i.e. bins are chosen so that each contains, as nearly as possible, the same number of observations.

Parameters:
  • arr (numpy.ndarray) – A continuous series of values.

  • method (str) – A string in [“histogram”, “marginal_equiquantization”].

  • nbin (int) – The number of bins ie the length of the alphabet.

  • d (dict, optional) – A dictionary.

Raises:

NotImplementedError – If method is not in the list above.

Returns:

A symbolic Sequence.

Return type:

Sequence

Tests and examples of how the module works:

>>> x = np.linspace(0,10,11)
>>> x
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])
>>> seq = partition(x, method='histogram', nbin=6)
>>> seq
Sequence: [0 0 1 1 2 3 3 4 4 5 5]
Alphabet(Symbol(0 | 0), Symbol(1 | 1), Symbol(2 | 2), Symbol(3 | 3), Symbol(4 | 4), Symbol(5 | 5))
N = 11 ; k = 6
>>> seq = partition(x, method='marginal_equiquantization',nbin=6)
>>> seq
Sequence: [0 0 1 1 2 2 3 3 4 4 5]
Alphabet(Symbol(0 | 0), Symbol(1 | 1), Symbol(2 | 2), Symbol(3 | 3), Symbol(4 | 4), Symbol(5 | 5))
N = 11 ; k = 6
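
For the 'histogram' method, equidistant binning can be sketched in plain Python as below. It reproduces the first example above, but it is only an illustration under the assumption that bins span the range from the minimum to the maximum of the data; `histogram_partition` is a hypothetical helper.

```python
def histogram_partition(values, nbin):
    """Assign each value to one of nbin equal-width bins over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant series: a single occupied bin
    symbols = []
    for x in values:
        index = int((x - lo) * nbin / (hi - lo))
        symbols.append(min(index, nbin - 1))  # the maximum falls in the last bin
    return symbols


print(histogram_partition([float(v) for v in range(11)], 6))
# [0, 0, 1, 1, 2, 3, 3, 4, 4, 5, 5]
```
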
scyseq.discretize.subdivision(data, iter_max)

Ulam method. Adaptive subdivision technique.

Based on: Set oriented numerical methods for dynamical systems Dellnitz M. and Junge O. Handbook of dynamical systems vol. 2 p. 221-264 Elsevier 2002.

and

Numerical approximation of random attractors Keller H. and Ochs G. in “Stochastic dynamics” Crauel H. and Gundlach M. Eds Springer 1999. p. 93-115

Parameters:
  • data (numpy.ndarray) – The input matrix/array to subdivide.

  • iter_max (int) – Maximum number of box iterations.

Returns:

A tuple containing (boxes, refs).

Return type:

tuple

scyseq.discretize.phase_cluster(data, nb_symb, target_dim=2)

This function provides the symbolic dynamics of multivariate data. It is based on clustering the “phase space” of the channels of an MEG temporal signal.

Parameters:
  • data (numpy.ndarray) – The input matrix; rows are channels and columns are time points. Must be an array.

  • nb_symb (int) – The number of bins used for the clustering, i.e. the number of symbols of the symbolic sequences that will be created.

  • target_dim (int, optional) – The number of eigenvectors retained for projecting the data.

Returns:

The clustering result arrays.

Return type:

numpy.ndarray

Algorithmic complexity

Utilities for algorithmic complexity on symbolic sequences.

The public helpers in this module compute Lempel-Ziv complexities on integer-encoded symbolic sequences. lz76 and lz77 accept non-empty one-dimensional arrays of integers and can optionally return the parsing history used to count phrases. lempel_ziv works on scyseq.sequence.Sequence objects and returns either the raw or the normalized complexity score.

scyseq.algorithmic.contains_sublist(lst, sublst)

Return whether sublst appears contiguously inside lst.

Parameters:
  • lst – Sequence to inspect.

  • sublst – Candidate contiguous subsequence.

Returns:

True if sublst appears in lst, False otherwise.

The empty sublist is considered to be contained in every list.

>>> contains_sublist([1, 2, 3, 4], [2, 3])
True
>>> contains_sublist([1, 2, 3], [2, 4])
False
>>> contains_sublist([1, 2], [])
True
scyseq.algorithmic.lz76(arr, summary=False)

Return the Lempel-Ziv complexity obtained with the LZ76 parsing.

Parameters:
  • arr – Non-empty array-like object of integers.

  • summary – If True, also return the parsing history.

Returns:

Either an integer or a tuple (complexity, history) when summary=True.

Raises:

IndexError if arr is empty.

The returned history is the ordered list of phrases discovered during the parsing.

>>> arr = np.array([0, 0, 1, 0, 1, 1], dtype=np.uint8)
>>> lz76(arr)
3
>>> lz76(arr, summary=True)
(3, [[0], [0, 1], [0, 1, 1]])
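
To make the parsing history above easier to follow, here is a short sketch of the standard LZ76 parsing: each new phrase is the shortest block that cannot be copied from the text preceding its last symbol. It reproduces the example above but is written for readability, not speed, and is not the library's code; `lz76_phrases` is a hypothetical helper.

```python
def lz76_phrases(symbols):
    """Return the list of LZ76 phrases of a non-empty sequence."""
    symbols = list(symbols)
    n = len(symbols)
    if n == 0:
        raise IndexError("empty sequence")

    def occurs(text, pattern):
        m = len(pattern)
        return any(text[i:i + m] == pattern for i in range(len(text) - m + 1))

    phrases = [symbols[:1]]  # the first symbol is always a new phrase
    start = 1
    while start < n:
        length = 1
        # Extend the candidate while it can be copied from earlier text
        while start + length <= n and occurs(
            symbols[:start + length - 1], symbols[start:start + length]
        ):
            length += 1
        phrases.append(symbols[start:min(start + length, n)])
        start += length
    return phrases


history = lz76_phrases([0, 0, 1, 0, 1, 1])
print(len(history), history)  # 3 [[0], [0, 1], [0, 1, 1]]
```
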
scyseq.algorithmic.lz77(arr, summary=False)

Return the Lempel-Ziv complexity obtained with the LZ77 parsing.

Parameters:
  • arr – Non-empty array-like object of integers.

  • summary – If True, also return the parsing history.

Returns:

Either an integer or a tuple (complexity, history) when summary=True.

Raises:

IndexError if arr is empty.

The returned history is the ordered list of phrases discovered during the parsing.

>>> arr = np.array([0, 0, 1, 0, 1, 1], dtype=np.uint8)
>>> lz77(arr)
3
>>> lz77(arr, summary=True)
(3, [[0], [0, 1], [0, 1, 1]])
scyseq.algorithmic.lempel_ziv(seq, parsing='lz76', norm=False, nbsur=None)

Return the Lempel-Ziv complexity of a symbolic sequence.

Parameters:
  • seq – A symbolic scyseq.sequence.Sequence.

  • parsing – Parsing name in ["lz76", "lz77"].

  • norm – If True, normalize the raw score with surrogate sequences.

  • nbsur – Number of surrogate sequences used for normalization.

Returns:

A float complexity score.

Raises:

NotImplementedError if parsing is not implemented. ValueError if norm is True and nbsur is not provided.

>>> seq = S.Sequence(np.array([0, 0, 1, 0, 1, 1], dtype=np.uint8), 2)
>>> lempel_ziv(seq)
1.292481250360578
>>> np.random.seed(123)
>>> lempel_ziv(seq, norm=True, nbsur=5)
0.0

Stochastic and Markov

Defines stochastic matrices

scyseq.stochastic.conditional_matrix(dependent, conditioning, smooth=None)

Returns the conditional matrix, i.e. P(y=j | x=i).

This is estimated using the maximum likelihood estimator.

The case P(x) = 0 is handled with add-k smoothing:

\(P(y \mid x) = (P(x,y) + k) / (P(x) + k |A_y|)\)

with \(|A_y|\) the alphabet length of the dependent sequence

Parameters:
  • dependent – a symbolic Sequence object

  • conditioning – a symbolic Sequence object

  • smooth – smoothing with add-k (see below)

Returns:

A numpy.array of floats

If smooth is None: no smoothing is applied. Can lead to non-stochastic matrices with NaN due to P(x) = 0

If smooth == 0 and P(x) == 0 raises an exception (cannot compute the stochastic matrix)

NB: rows should sum to one (from any state, one should go somewhere), i.e. np.sum(matrix, axis=1) == [[1]…[1]]; see markov_sequence in generate.py.

Example :

>>> np.random.seed(9)
>>> a = np.random.choice([0,1],1000,replace=True, p=[0.7,0.3])
>>> np.random.seed(6)
>>> b = np.random.choice([0,1],1000,replace=True, p=[0.7,0.3])
>>> A = S.Alphabet(['a','b'])
>>> seq1 = S.Sequence(a,A)
>>> seq2 = S.Sequence(b,A)
>>> conditional_matrix(seq1, seq2)
array([[0.71947674, 0.28052326],
       [0.69551282, 0.30448718]])
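
The estimator with add-k smoothing can be sketched on plain lists of ivals. Note the assumptions: the sketch works with raw counts rather than probabilities, so the smoothing constant is on the scale of counts, and it raises a ValueError when a row cannot be normalised. `conditional_matrix_sketch` and its parameter names are hypothetical.

```python
def conditional_matrix_sketch(dependent, conditioning, k_dep, k_cond, smooth=0.0):
    """Estimate P(y=j | x=i) as a k_cond x k_dep list of rows."""
    counts = [[0] * k_dep for _ in range(k_cond)]
    for y, x in zip(dependent, conditioning):
        counts[x][y] += 1
    matrix = []
    for row in counts:
        total = sum(row) + smooth * k_dep  # add-k on every cell of the row
        if total == 0:
            raise ValueError("a conditioning symbol never occurs; use smooth > 0")
        matrix.append([(c + smooth) / total for c in row])
    return matrix


y = [0, 1, 1, 1]
x = [0, 0, 1, 1]
print(conditional_matrix_sketch(y, x, 2, 2))              # [[0.5, 0.5], [0.0, 1.0]]
print(conditional_matrix_sketch(y, x, 2, 2, smooth=1.0))  # [[0.5, 0.5], [0.25, 0.75]]
```

Each row sums to one, whether or not smoothing is applied.
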
scyseq.stochastic.transition_matrix(seq, time=1, smooth=0)

Returns the transition matrix.

This is estimated using the maximum likelihood estimator.

Parameters:

seq – a symbolic Sequence object

Returns:

A numpy.ndarray of floats

Example :

>>> np.random.seed(9)
>>> a = np.random.choice([0,1],1000,replace=True, p=[0.7,0.3])
>>> A = S.Alphabet(['a','b'])
>>> seq = S.Sequence(a, A)
>>> transition_matrix(seq)
array([[0.70224719, 0.29775281],
       [0.73519164, 0.26480836]])
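
A maximum likelihood transition matrix is obtained by counting pairs of symbols separated by a given lag and normalising each row. The sketch below assumes that the time parameter is this lag and applies no smoothing; `transition_matrix_sketch` is a hypothetical helper working on a plain list of ivals.

```python
def transition_matrix_sketch(ivals, k, time=1):
    """Estimate P(s(t+time)=j | s(t)=i) from a single list of ivals."""
    counts = [[0] * k for _ in range(k)]
    for current, future in zip(ivals, ivals[time:]):
        counts[current][future] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a symbol is never followed by another one")
        matrix.append([c / total for c in row])
    return matrix


print(transition_matrix_sketch([0, 1, 0, 0, 1, 0], 2))
```

In this small example, symbol 1 is always followed by symbol 0, so the second row is [1.0, 0.0].
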
scyseq.stochastic.influence_matrix(seq1, seq2, time=1, smooth=0)

Returns the influence matrix, i.e. P(x1(T+t)=j | x2(T)=i).

This is estimated using the maximum likelihood estimator.

Parameters:
  • seq1 – a symbolic Sequence object

  • seq2 – a symbolic Sequence object

Returns:

A numpy.ndarray of floats

Example :

>>> np.random.seed(9)
>>> a = np.random.choice([0,1],1000,replace=True, p=[0.7,0.3])
>>> np.random.seed(6)
>>> b = np.random.choice([0,1],1000,replace=True, p=[0.7,0.3])
>>> A = S.Alphabet(['a','b'])
>>> seq1 = S.Sequence(a,A)
>>> seq2 = S.Sequence(b,A)
>>> influence_matrix(seq1, seq2)
array([[0.70887918, 0.29112082],
       [0.71794872, 0.28205128]])

Information theory

Information-theoretic measures for symbolic sequences.

This module gathers entropy, mutual-information, and transfer-entropy helpers for scyseq.sequence.Sequence objects. All logarithms are natural logarithms, so the returned values are expressed in nats.

scyseq.information.metric_entropy(seq)

Return Shannon’s metric entropy of a symbolic sequence.

Parameters:

seq – A symbolic scyseq.sequence.Sequence.

Returns:

A float entropy value in nats.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq = sq.Sequence(a, alpha)
>>> metric_entropy(seq)
0.6003511877776578
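
For reference, the Shannon entropy in nats of the empirical symbol distribution, \(H = -\sum_i p_i \ln p_i\), can be sketched with the standard library. `shannon_entropy_nats` is a hypothetical helper operating on a plain list of ivals, not the library function.

```python
import math
from collections import Counter


def shannon_entropy_nats(ivals):
    """Entropy of the empirical distribution of symbols, in nats."""
    n = len(ivals)
    # Only symbols that occur contribute, so log(0) never arises
    return -sum((c / n) * math.log(c / n) for c in Counter(ivals).values())


print(shannon_entropy_nats([0, 1, 0, 1]))  # close to ln 2, about 0.693
```
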
scyseq.information.H(seq)

Return Shannon’s metric entropy of a symbolic sequence.

Parameters:

seq – A symbolic scyseq.sequence.Sequence.

Returns:

A float entropy value in nats.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq = sq.Sequence(a, alpha)
>>> metric_entropy(seq)
0.6003511877776578
scyseq.information.shannon_entropy(seq)

Return Shannon’s metric entropy of a symbolic sequence.

Parameters:

seq – A symbolic scyseq.sequence.Sequence.

Returns:

A float entropy value in nats.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq = sq.Sequence(a, alpha)
>>> metric_entropy(seq)
0.6003511877776578
scyseq.information.topological_entropy(seq)

Return the topological entropy of a symbolic sequence.

Parameters:

seq – A symbolic scyseq.sequence.Sequence.

Returns:

A float entropy value in nats.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq = sq.Sequence(a, alpha)
>>> topological_entropy(seq)
0.6931471805599453
scyseq.information.T(seq)

Return the topological entropy of a symbolic sequence.

Parameters:

seq – A symbolic scyseq.sequence.Sequence.

Returns:

A float entropy value in nats.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq = sq.Sequence(a, alpha)
>>> topological_entropy(seq)
0.6931471805599453
scyseq.information.renyi_entropy(seq, coef)

Return the Renyi entropy of order coef.

Parameters:
  • seq – A symbolic scyseq.sequence.Sequence.

  • coef – The order of the Renyi entropy.

Returns:

A float entropy value in nats.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq = sq.Sequence(a, alpha)
>>> renyi_entropy(seq, 0.9)
0.6088567303148161
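
The Renyi entropy of order \(\alpha \neq 1\) is usually defined as \(H_\alpha = \frac{1}{1-\alpha} \ln \sum_i p_i^\alpha\). The sketch below applies this textbook definition to the empirical symbol frequencies; whether the library handles the limit \(\alpha \to 1\) is not stated here, so the hypothetical helper `renyi_entropy_nats` simply rejects it.

```python
import math
from collections import Counter


def renyi_entropy_nats(ivals, order):
    """Renyi entropy (in nats) of the empirical symbol distribution."""
    if order == 1:
        raise ValueError("order 1 is the Shannon limit; use a Shannon entropy instead")
    n = len(ivals)
    probs = [c / n for c in Counter(ivals).values()]
    return math.log(sum(p ** order for p in probs)) / (1 - order)


print(renyi_entropy_nats([0, 1, 0, 1], 2))  # close to ln 2 for a uniform distribution
```
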
scyseq.information.R(seq, coef)

Return the Renyi entropy of order coef.

Parameters:
  • seq – A symbolic scyseq.sequence.Sequence.

  • coef – The order of the Renyi entropy.

Returns:

A float entropy value in nats.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq = sq.Sequence(a, alpha)
>>> renyi_entropy(seq, 0.9)
0.6088567303148161
scyseq.information.block_entropy(seq, wlen)

Return the entropy of the overlapping words of length wlen.

Parameters:
  • seq – A symbolic scyseq.sequence.Sequence.

  • wlen – The word length.

Returns:

A float entropy value in nats.

Raises:

ValueError if wlen is invalid.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq = sq.Sequence(a, alpha)
>>> block_entropy(seq, 6)
3.577559335188841
scyseq.information.entropy_rate(seq, wlen, method='average')

Return an entropy-rate estimate based on block entropies.

Parameters:
  • seq – A symbolic scyseq.sequence.Sequence.

  • wlen – Positive word length not exceeding len(seq).

  • method – One of ["average", "difference"].

Returns:

The entropy-rate estimate as a float.

Raises:

ValueError if wlen is invalid. NotImplementedError if method is unsupported.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq = sq.Sequence(a, alpha)
>>> entropy_rate(seq, 6)
0.5962598891981402
>>> entropy_rate(seq, 6, method="difference")
0.5689107029836205
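
The "average" value above equals the block entropy of the same example divided by the word length (3.5775... / 6). The sketch below illustrates both estimates on a plain list of ivals; reading "difference" as \(H_n - H_{n-1}\) is an assumption, and the helper names are hypothetical.

```python
import math
from collections import Counter


def block_entropy_nats(ivals, wlen):
    """Entropy (in nats) of the overlapping words of length wlen."""
    words = [tuple(ivals[i:i + wlen]) for i in range(len(ivals) - wlen + 1)]
    n = len(words)
    return -sum((c / n) * math.log(c / n) for c in Counter(words).values())


def entropy_rate_sketch(ivals, wlen, method="average"):
    """Entropy-rate estimate from block entropies."""
    if method == "average":
        return block_entropy_nats(ivals, wlen) / wlen
    if method == "difference":
        # Assumed meaning: increment of block entropy from wlen - 1 to wlen
        return block_entropy_nats(ivals, wlen) - block_entropy_nats(ivals, wlen - 1)
    raise NotImplementedError(method)


periodic = [0, 1] * 50
print(entropy_rate_sketch(periodic, 2))
print(entropy_rate_sketch(periodic, 2, method="difference"))
```

For this periodic sequence the "difference" estimate is close to zero, as expected for a fully predictable signal, while the "average" estimate converges more slowly with the word length.
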
scyseq.information.effective_complexity(seq, n_max)

Return Grassberger’s effective complexity estimate.

Parameters:
Returns:

A float effective-complexity value.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq = sq.Sequence(a, alpha)
>>> effective_complexity(seq, 6)
0.05784767682768299
scyseq.information.mutual_information(seq1, seq2)

Return the mutual information between two symbolic sequences.

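Parameters:
  • seq1 – A symbolic scyseq.sequence.Sequence.

  • seq2 – A symbolic scyseq.sequence.Sequence of the same length as seq1.

Returns: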

The mutual information as a float.

Raises:

LengthError if the sequences do not have the same length.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> np.random.seed(6)
>>> b = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq1 = sq.Sequence(a, alpha)
>>> seq2 = sq.Sequence(b, alpha)
>>> mutual_information(seq1, seq2)
0.0002988020334349084
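
Mutual information can be written with entropies as \(I(X;Y) = H(X) + H(Y) - H(X,Y)\). The following sketch evaluates this identity on plain lists of ivals; it is an illustration with hypothetical helper names and raises a plain ValueError where the library raises LengthError.

```python
import math
from collections import Counter


def entropy_nats(items):
    """Entropy (in nats) of the empirical distribution of hashable items."""
    items = list(items)
    n = len(items)
    return -sum((c / n) * math.log(c / n) for c in Counter(items).values())


def mutual_information_nats(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned lists of ivals."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return entropy_nats(x) + entropy_nats(y) - entropy_nats(zip(x, y))


print(mutual_information_nats([0, 1, 0, 1], [0, 1, 0, 1]))  # identical sequences: about ln 2
print(mutual_information_nats([0, 0, 1, 1], [0, 1, 0, 1]))  # independent pattern: about 0
```
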
scyseq.information.multi_information(seq1, seq2, seq3)

Return the three-variable mutual information for symbolic sequences.

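Parameters:
  • seq1 – A symbolic scyseq.sequence.Sequence.

  • seq2 – A symbolic scyseq.sequence.Sequence of the same length as seq1.

  • seq3 – A symbolic scyseq.sequence.Sequence of the same length as seq1.

Returns: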

The three-variable mutual information as a float.

Raises:

LengthError if the sequences do not have the same length.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> np.random.seed(6)
>>> b = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> np.random.seed(3)
>>> c = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq1 = sq.Sequence(a, alpha)
>>> seq2 = sq.Sequence(b, alpha)
>>> seq3 = sq.Sequence(c, alpha)
>>> multi_information(seq1, seq2, seq3)
-4.8757282800737656e-05
scyseq.information.transfer_entropy(seq1, seq1p, seq2)

Return the symbolic transfer entropy from seq2 to seq1.

seq1 corresponds to \(x_t\), seq1p to \(x_{t+1}\), and seq2 to \(y_t\).

Parameters:
  • seq1 – Target sequence at time \(t\).

  • seq1p – Shifted version of the target sequence at time \(t+1\).

  • seq2 – Driving sequence at time \(t\).

Returns:

The transfer entropy as a float.

Raises:

LengthError if the sequences do not have the same length.

>>> np.random.seed(9)
>>> a = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> np.random.seed(6)
>>> b = np.random.choice([0, 1], 1000, replace=True, p=[0.7, 0.3])
>>> alpha = sq.Alphabet(["a", "b"])
>>> seq1 = sq.Sequence(a, alpha)
>>> seq2 = sq.Sequence(b, alpha)
>>> transfer_entropy(seq1[:-1], seq1[1:], seq2[:-1])
0.00019242807727182232
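
With the notation above, transfer entropy from \(y\) to \(x\) can be expressed through joint entropies: \(T_{y \to x} = H(x_{t+1}, x_t) - H(x_t) - H(x_{t+1}, x_t, y_t) + H(x_t, y_t)\). The sketch below evaluates this textbook identity on plain lists; it is not the library's code and the helper names are hypothetical.

```python
import math
from collections import Counter


def entropy_nats(items):
    """Entropy (in nats) of the empirical distribution of hashable items."""
    items = list(items)
    n = len(items)
    return -sum((c / n) * math.log(c / n) for c in Counter(items).values())


def transfer_entropy_nats(x_t, x_next, y_t):
    """Transfer entropy from y to x for three aligned lists of ivals."""
    return (
        entropy_nats(zip(x_next, x_t))
        - entropy_nats(x_t)
        - entropy_nats(zip(x_next, x_t, y_t))
        + entropy_nats(zip(x_t, y_t))
    )


# Toy case: the next value of x copies the current value of y
y_t = [0, 1, 0, 1, 1, 0, 0, 1]
x_t = [0, 0, 1, 1, 0, 0, 1, 1]
x_next = list(y_t)
print(transfer_entropy_nats(x_t, x_next, y_t))  # about ln 2
```

In this toy case the driver fully determines the next target symbol, so the result is one bit expressed in nats; a constant driver would give zero.
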

Sequence generators

Generation of specified symbolic sequences

scyseq.generator.generate(method, N, k, *args)

Generates a Sequence according to a method

Parameters:
  • method (str) – a string in [“uniform”, “markov”, “binary_logistic”]

  • N (int) – the length of the sequence

  • k (int) – the length of the alphabet

  • args – supplementary parameters (transition matrix and order for “markov” and parameter for “binary_logistic”)

Raises:

NotImplementedError if method is not in the list above.

Returns:

A Sequence object.

scyseq.generator.uniform_sequence(length, alen)

Returns a uniform random sequence.

Parameters:
  • length – the length of the sequence

  • alen – the length of the alphabet

Returns:

a Sequence object

scyseq.generator.binary_map1d_sequence(length, map1d, xinit, threshold=0.5, skip=100)

Returns a binary sequence with a specified one-dimensional map dynamics

map1d can be specified as, for example, map1d = lambda x: 3.4 * x * (1 - x), or as any function that defines x(t+1) as a function of x(t).

Parameters:
  • length – the length of the sequence

  • threshold – the threshold value used to make a binary sequence

Returns:

A binary Sequence

scyseq.generator.binary_logistic_sequence(length, param, xinit, threshold=0.5, skip=100)

Returns a binary sequence with logistic dynamics according to the parameter \(\mu\).

The equation used here is: \(x(t+1) = \mu \, x(t) \, (1 - x(t))\)

Parameters:
  • length – the length of the sequence

  • param – the parameter \(\mu\) of the logistic equation

  • threshold – the threshold value used to make a binary sequence

Returns:

A binary Sequence
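
A minimal sketch of how such a binary sequence can be produced from the logistic map. Two points are assumptions rather than documented behaviour: skip is taken to be the number of transient iterations discarded before recording, and a symbol is 1 when the state exceeds the threshold. `binary_logistic_ivals` is a hypothetical helper returning a plain list.

```python
def binary_logistic_ivals(length, param, xinit, threshold=0.5, skip=100):
    """Iterate x -> param * x * (1 - x) and threshold the states into 0/1."""
    x = xinit
    for _ in range(skip):  # discard the transient (assumed role of skip)
        x = param * x * (1 - x)
    symbols = []
    for _ in range(length):
        x = param * x * (1 - x)
        symbols.append(1 if x > threshold else 0)
    return symbols


# For param = 3.2 the map settles on a period-2 cycle (about 0.51 and 0.80),
# so a threshold between the two values yields an alternating sequence.
print(binary_logistic_ivals(10, 3.2, 0.3, threshold=0.6))
```
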

scyseq.generator.markov_sequence(length, alen, markov_matrix, order)

Returns a sequence generated by a Markov process of the given order with transition matrix markov_matrix.

Parameters:
  • length – the length of the sequence

  • alen – the length of the alphabet

  • markov_matrix – the transition matrix

  • order – the order of the Markov process

Raises:

ValueError: if the shape of markov_matrix does not correspond to the order of the process, i.e. \(k^o \times k\), with \(k\) the alphabet length and \(o\) the order

Returns:

A sequence object

NB:

  • the rows of the Markov matrix give the probabilities of transitioning to each of the k symbols of the alphabet, so each row must sum to one, i.e. np.sum(markov_matrix, axis=1) == [[1]…[1]]
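
The row-sum condition matters because each row is used as a probability distribution over the next symbol. The sketch below illustrates this for a first-order process only, using the standard library; the uniform choice of the initial state and the helper name `markov_ivals` are assumptions made for the illustration.

```python
import random


def markov_ivals(length, markov_matrix, rng=None):
    """Sample a first-order Markov chain from a row-stochastic k x k matrix."""
    rng = rng or random.Random()
    k = len(markov_matrix)
    for row in markov_matrix:
        if len(row) != k or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row must have k entries that sum to one")
    if length <= 0:
        return []
    state = rng.randrange(k)  # assumed: uniform initial state
    symbols = [state]
    for _ in range(length - 1):
        # The row of the current state gives the next-symbol probabilities
        state = rng.choices(range(k), weights=markov_matrix[state])[0]
        symbols.append(state)
    return symbols


# A deterministic matrix forces the two symbols to alternate
print(markov_ivals(6, [[0.0, 1.0], [1.0, 0.0]], random.Random(0)))
```
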

Input-output

scyseq.io.read_codix(fname, data_only=True)

Reads data file from the codix encoder of the codix software suite for behavioral studies.

Returns in all cases a dictionary with data[‘site’][‘code’] = Sequence.

Visualisation

scyseq.viz.get_state_colors(alphabet, cmap_name='viridis')

Get consistent color scheme for alphabet states. Returns a colormap and normalization that ensures each state gets the same color across plots.

Parameters:
  • alphabet – A symbolic Alphabet object

  • cmap_name – Name of matplotlib colormap (default: ‘viridis’)

Returns:

Tuple of (cmap, norm) for consistent coloring

scyseq.viz.plot(seq, xlabel='Time', ylabel='States', title='Simple plot', labelsize=15, titlesize=25, color='blue', **kwargs)

Simple (discrete / symbolic) time series plot

scyseq.viz.plot_bar(seq, xlabel='Time', ylabel='States', title='Bar plot', labelsize=15, titlesize=25, cmap_name='viridis', legend=False, legend_title='States', **kwargs)

Plots a barcode-like graph with consistent state colors.

scyseq.viz.plot_color(seq, aspect='auto', title='Sequence', xlabel='Time', labelsize=15, titlesize=25, cmap_name='viridis', figsize=(10, 2.4), legend=True, legend_title='States', **kwargs)

Plots a sequence as a color strip with consistent state colors.

scyseq.viz.plot_grid(seq1, seq2, xlabel='1st sequence', ylabel='2nd Sequence', title='Grid plot', labelsize=15, titlesize=25, color='blue', alpha=0.3, scale=100, jitter=0.4, **kwargs)

Plots state-space grids, inspired by

Hollenstein T. (2013) State space grids. Springer.

scyseq.viz.plot_independence(seq1, seq2, xlabel='1st sequence', ylabel='2nd Sequence', title='Independence plot', labelsize=15, titlesize=25, color=('blue', 'red'), alpha=0.3, scale=100, **kwargs)

Plots state-space grids representing the elements of the mutual information between sequences.

Exceptions

Exception classes for the scyseq library.

exception scyseq.exceptions.ScyseqError

Base exception for all errors raised by the scyseq library.

exception scyseq.exceptions.SymbolError

Base exception for symbol-related issues.

exception scyseq.exceptions.SymbolDefinitionError(value, msg)

Exception raised when a symbol cannot be defined.

__init__(value, msg)
exception scyseq.exceptions.SymbolAccessError(msg)

Exception raised when a symbol cannot be accessed.

__init__(msg)
exception scyseq.exceptions.AlphabetError

Base exception for alphabet-related issues.

exception scyseq.exceptions.AlphabetAccessError(msg)

Exception raised when an alphabet cannot be accessed.

__init__(msg)
exception scyseq.exceptions.InvalidSymbolError(symbol, alphabet)

Raised when an invalid symbol is used in an alphabet or sequence.

__init__(symbol, alphabet)
exception scyseq.exceptions.EmptyAlphabetError

Raised when attempting to use an empty alphabet.

__init__()
exception scyseq.exceptions.SequenceError

Base exception for sequence-related issues.

exception scyseq.exceptions.SequenceParseError(sequence, message='Unable to parse sequence.')

Raised when parsing a sequence fails due to invalid format.

__init__(sequence, message='Unable to parse sequence.')
exception scyseq.exceptions.LengthError(message)

Raised when sequence lengths do not match an operation’s requirements.

__init__(message)
exception scyseq.exceptions.SymbolMismatchError(sequence, invalid_symbols)

Raised when a sequence contains symbols not in the defined alphabet.

__init__(sequence, invalid_symbols)

Recurrence quantification

This module contains functions for the quantification of symbolic recurrence plots.

Some references are:

Faure and Lesne (2010) Recurrence plots for symbolic sequence. International Journal of Bifurcation and Chaos

Zou et al. (2015) Identifying coupling directions by recurrences. In Recurrence Quantification Analysis.

scyseq.recurrence.recurrence(seq)

Compute a recurrence plot for the given sequence.

Parameters:

seq (Sequence) – The input symbolic sequence to calculate recurrence for.

Returns:

A 2D array representing the recurrence plot.

Return type:

numpy.ndarray
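
For a symbolic sequence, a recurrence plot is commonly defined by marking the pairs of times at which the same symbol occurs. The sketch below builds such a matrix as nested lists; it assumes this exact-match definition and uses the hypothetical helper name `recurrence_matrix`, whereas the library returns a NumPy array.

```python
def recurrence_matrix(ivals):
    """R[i][j] is 1 when the symbols at times i and j are equal, else 0."""
    return [[1 if a == b else 0 for b in ivals] for a in ivals]


for row in recurrence_matrix([0, 1, 0]):
    print(row)
# [1, 0, 1]
# [0, 1, 0]
# [1, 0, 1]
```

The matrix is symmetric with ones on the diagonal, since every symbol recurs with itself.
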