pyliblinear package

Module contents

Copyright:

Copyright 2015 - 2017 André Malo or his licensors, as applicable

License:

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

pyliblinear - a liblinear python API

class pyliblinear.FeatureMatrix

Bases: object

Feature matrix to be used for training or prediction.

features(self)

Return the features as iterator of dicts.

Return:The feature vectors
Rtype:iterable
from_iterables(cls, labels, features)

Create FeatureMatrix instance from two separate iterables: labels and features.

Parameters:
labels : iterable

Iterable providing the labels per feature vector (assigned by order)

features : iterable

Iterable providing the feature vector per label (assigned by order)

Return:

New feature matrix instance

Rtype:

FeatureMatrix

Exceptions:
  • ValueError : The lengths of the iterables differ
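
The pairing by order and the length check described above can be sketched in plain Python. This is only an illustration, not pyliblinear's implementation: the helper name pair_labels_and_features is hypothetical, and the dict-shaped feature vectors (feature index mapped to value) are an assumption that mirrors what features() is documented to return.

```python
def pair_labels_and_features(labels, features):
    """Pair each label with the feature vector at the same position.

    Hypothetical helper; it materializes both iterables, which the real
    library may well avoid.
    """
    labels = list(labels)
    features = list(features)
    if len(labels) != len(features):
        # Mirrors the documented ValueError for differing lengths.
        raise ValueError("labels and features differ in length")
    return list(zip(labels, features))


# Two labels, two sparse feature vectors (assumed shape: {index: value}).
rows = pair_labels_and_features([1, 0], [{1: 0.5, 3: 1.0}, {2: 2.0}])
```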
height

The matrix height (number of labels and vectors).

Type:int
labels(self)

Return the labels as iterator.

Return:The labels
Rtype:iterable
load(cls, file)

Create FeatureMatrix instance from a file.

Each line of the file contains the label and the accompanying sparse feature vector, separated by a space/tab sequence. The feature vector consists of index/value pairs. The index and the value are separated by a colon (:). The pairs are separated by space/tab sequences. Accepted line endings are \r, \n and \r\n.

All numbers are represented as strings parsable either as ints (for indexes) or doubles (for values and labels).

Note that the exact I/O exceptions depend on the stream passed in.
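
As a rough illustration of the line format described above, the following pure-Python sketch parses such text into (label, vector) pairs. It does not use pyliblinear; parse_line and parse_text are hypothetical names, and skipping blank lines is an assumption. Note that str.splitlines accepts a few more line terminators than the three listed here.

```python
def parse_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value})."""
    fields = line.split()  # splits on any run of spaces/tabs
    label = float(fields[0])  # labels are parsed as doubles
    vector = {}
    for pair in fields[1:]:
        index, value = pair.split(":", 1)
        vector[int(index)] = float(value)  # int index, double value
    return label, vector


def parse_text(text):
    """Parse all non-blank lines; handles \\r, \\n and \\r\\n endings."""
    return [parse_line(line) for line in text.splitlines() if line.strip()]


parsed = parse_text("1 1:0.5\t3:1\r\n-1 2:2.5\r0 4:1e-1\n")
```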

Parameters:
file : file or str

Either a readable stream or a filename. If the passed object provides a read attribute/method, it is treated as a readable file stream; otherwise it is treated as a filename. A stream is read from its current position and remains open after hitting EOF. A file given by name is opened in text mode, read from the beginning and closed afterwards.

Return:

New feature matrix instance

Rtype:

FeatureMatrix

Exceptions:
  • IOError : Error reading the file
  • ValueError : Error parsing the file
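
The stream-or-filename rule for the file parameter can be sketched as a simple attribute check. This is a minimal illustration under the assumption that the check is equivalent to hasattr(file, "read"); read_all is a hypothetical helper, not part of pyliblinear.

```python
import io


def read_all(file):
    """Read text from a stream or from a filename (hypothetical helper)."""
    if hasattr(file, "read"):
        # Stream: read from the current position and leave it open.
        return file.read()
    # Filename: open in text mode, read from the start, close afterwards.
    with open(file, "r") as handle:
        return handle.read()


stream = io.StringIO("already consumed\n1 1:0.5\n")
stream.readline()        # advance past the first line
text = read_all(stream)  # only the remainder is read
```

After the call the stream is still open, so the caller stays responsible for closing it.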
save(self, file)

Save FeatureMatrix instance to a file.

Each line of the file contains the label and the accompanying sparse feature vector, separated by a space. The feature vector consists of index/value pairs. The index and the value are separated by a colon (:). The pairs are separated by a space again. The line ending is \n.

All numbers are represented as strings parsable either as ints (for indexes) or doubles (for values and labels).

Note that the exact I/O exceptions depend on the stream passed in.
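
The output format described above can be sketched as follows. This is not the library's writer: format_line is a hypothetical helper, and both the ascending index order and the use of repr() for number formatting are assumptions.

```python
import io


def format_line(label, vector):
    """Format one row as 'label index:value ...' terminated by \\n."""
    parts = [repr(float(label))]
    for index, value in sorted(vector.items()):  # assumed: ascending index
        parts.append("%d:%s" % (index, repr(float(value))))
    return " ".join(parts) + "\n"


out = io.StringIO()
out.write(format_line(1, {3: 1.0, 1: 0.5}))
out.write(format_line(-1, {2: 2.5}))
```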

Parameters:
file : file or str

Either a writeable stream or a filename. If the passed object provides a write attribute/method, it is treated as a writeable stream; otherwise it is treated as a filename. A stream is written to at its current position and remains open when done. A file given by name is opened in text mode, truncated, written from the beginning and closed afterwards.

Exceptions:
  • IOError : Error writing the file
width

The matrix width (number of features).

Type:int
class pyliblinear.Solver

Bases: object

Solver container

C

The configured C parameter.

Type:float
eps

The configured eps parameter.

Type:float
p

The configured p parameter.

Type:float
type

The configured solver type.

Type:str
weights(self)

Return the configured weights as a dict (label -> weight).

Return:The weights (may be empty)
Rtype:dict
class pyliblinear.Model

Bases: object

Classification model. Use its Model.load or Model.train methods to construct a new instance.

bias

Bias used to create the model, or None if no bias was applied.

Type:double
is_probability

Is model a probability model?

Type:bool
is_regression

Is model a regression model?

Type:bool
load(cls, file, mmap=False)

Create Model instance from a file (previously created by Model.save())

Note that the exact I/O exceptions depend on the stream passed in.

Parameters:
file : file or str

Either a readable stream or a filename. If the passed object provides a read attribute/method, it is treated as a readable file stream; otherwise it is treated as a filename. A stream is read from its current position and remains open after hitting EOF. A file given by name is opened in text mode, read from the beginning and closed afterwards.

mmap : bool

Load the model into a file-backed memory area? Default: false

Return:

New model instance

Rtype:

Model

Exceptions:
  • IOError : Error reading the file
  • ValueError : Error parsing the file
predict(self, matrix, label_only=True, probability=False)

Run the model on matrix and predict labels.

Parameters:
matrix : pyliblinear.FeatureMatrix or iterable

Either a feature matrix or a simple iterator over feature vectors to inspect and predict upon.

label_only : bool

Return the label only? If false, the decision dict for all labels is returned as well.

probability : bool

Use probability estimates?

Return:

Result iterator. Either over labels or over label/decision dict tuples.

Rtype:

iterable

save(self, file)

Save Model instance to a file.

After some basic information about solver type, dimensions and labels, the model matrix is stored as one sequence of doubles per line. The matrix is transposed, so the height is the number of features (including the bias feature) and the width is the number of classes.

All numbers are represented as strings parsable either as ints (for dimensions and labels) or doubles (other values).

Note that the exact I/O exceptions depend on the stream passed in.
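
To make the transposed layout concrete, the sketch below turns a per-class weight matrix into one text line per feature. It only illustrates the row/column orientation: transposed_rows is a hypothetical helper, the header block is omitted, and the repr()-based number formatting is an assumption rather than the format Model.save actually emits.

```python
def transposed_rows(weights_per_class):
    """Return one line per feature, each holding one weight per class.

    weights_per_class: list with one weight list per class, all of equal
    length (the number of features, plus a bias entry if one is used).
    """
    return [
        " ".join(repr(float(weight)) for weight in column)
        for column in zip(*weights_per_class)  # transpose classes x features
    ]


# 2 classes x 3 features becomes 3 lines (height) of 2 numbers (width).
lines = transposed_rows([[0.1, 0.2, 0.3], [1.0, 2.0, 3.0]])
```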

Parameters:
file : file or str

Either a writeable stream or a filename. If the passed object provides a write attribute/method, it is treated as a writeable stream; otherwise it is treated as a filename. A stream is written to at its current position and remains open when done. A file given by name is opened in text mode, truncated, written from the beginning and closed afterwards.

Exceptions:
  • IOError : Error writing the file
solver_type

Solver type used to create the model

Type:str
train(cls, matrix, solver=None, bias=None)

Create model instance from a training run

Parameters:
matrix : pyliblinear.FeatureMatrix

Feature matrix to use for training

solver : pyliblinear.Solver

Solver instance. If omitted or None, a default solver is picked.

bias : float

Bias to the hyperplane. If omitted or None, no bias is applied. If given, it must satisfy bias >= 0.

Return:

New model instance

Rtype:

Model