Parsers

A parser is used to parse financial documents such as account statements and transaction histories.

Bundled Parsers

NiveshPy comes bundled with a few useful parsers.

Parser Name | Description | Command
CAS Parser | Parser for CAMS and Kfintech Consolidated Account Statements (CAS). | niveshpy parse cas ...

Custom Parsers

You can create your own custom parser for NiveshPy with Python.

In your pyproject.toml file, include an entry point with the group niveshpy.parsers.

[project.entry-points."niveshpy.parsers"] # (1)
my_parser = "my_plugin:MyPluginFactory" # (2)
  1. This needs to be added as-is to ensure NiveshPy can find your parser.
  2. Replace my_parser with a unique key, and replace the string with a reference to your PluginFactory class.

Create a class my_plugin.MyPluginFactory that follows the protocol ParserFactory. This factory will be responsible for actually creating the parser object with the input file and password.

The create_parser method must return an instance of a class that follows the protocol Parser.

The get_parser_info method must return an object of type ParserInfo.
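Because ParserFactory and Parser are protocols, a class conforms by providing the required methods; it does not need to inherit from anything. The sketch below illustrates this structural matching with a stand-in protocol named FactoryLike. That name, and the placeholder return values, are assumptions made for the example and are not NiveshPy's own definitions.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class FactoryLike(Protocol):
    """Stand-in protocol for illustration; NOT NiveshPy's ParserFactory."""

    def get_parser_info(self): ...
    def create_parser(self, file_path, password=None): ...


class MyPluginFactory:  # note: no base class
    @classmethod
    def get_parser_info(cls):
        return {"name": "My Parser"}  # placeholder for a ParserInfo object

    @classmethod
    def create_parser(cls, file_path, password=None):
        return object()  # placeholder for a Parser instance


class Incomplete:  # lacks create_parser, so it does not conform
    @classmethod
    def get_parser_info(cls):
        return {"name": "Incomplete"}


# A runtime check only verifies that the method names exist, not their signatures.
conforms = isinstance(MyPluginFactory(), FactoryLike)
missing = isinstance(Incomplete(), FactoryLike)
print(conforms, missing)
```

A static type checker such as mypy or pyright can verify the signatures as well, which the runtime check cannot.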

Usage

To use a custom parser, install it in the same virtual environment as NiveshPy and run the command:

niveshpy parse my_parser

NiveshPy will use the name defined in your pyproject.toml as a unique key (in this example, my_parser).

Warning

If another installed parser has the same key, your parser may be overwritten. If this happens, a warning will be logged. If you find yourself in this situation, you can change your parser key or advise the user to uninstall the other parser.
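The collision described above can be pictured with a small registry keyed by parser name. This is only a sketch under stated assumptions: the register function and the logger name are hypothetical, and the rule that the last registration wins is an assumption, since the text above says only that a parser may be overwritten.

```python
import logging

logger = logging.getLogger("parser_registry_sketch")  # hypothetical logger name


def register(registry: dict[str, str], key: str, target: str) -> None:
    """Store `target` under `key`, warning if the key already points elsewhere."""
    if key in registry and registry[key] != target:
        logger.warning(
            "Parser key %r is already used by %s; replacing it with %s",
            key, registry[key], target,
        )
    registry[key] = target  # assumption: the last registration wins


registry: dict[str, str] = {}
register(registry, "my_parser", "my_plugin:MyPluginFactory")
register(registry, "my_parser", "other_plugin:OtherFactory")  # logs a warning
```

Choosing a key that includes your package name makes such collisions less likely.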

Shell Completion

NiveshPy CLI supports shell completion. If the user types the partial command niveshpy parse ... and presses Tab, the CLI looks for parsers whose keys start with the partial key entered (or all parsers if no key has been entered). Depending on the terminal, the CLI shows a list of the matching keys along with the name defined in your ParserInfo.name.
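The prefix matching described above can be sketched as a simple filter over the registered keys. The function name complete_parser_keys and the dictionary of keys to display names are assumptions made for illustration; NiveshPy's actual completion code is not shown here.

```python
def complete_parser_keys(
    parsers: dict[str, str], incomplete: str
) -> list[tuple[str, str]]:
    """Return (key, display name) pairs whose key starts with `incomplete`."""
    return sorted(
        (key, name) for key, name in parsers.items() if key.startswith(incomplete)
    )


available = {"cas": "CAS Parser", "my_parser": "Sample Parser"}
print(complete_parser_keys(available, "my"))  # [('my_parser', 'Sample Parser')]
print(complete_parser_keys(available, ""))    # every parser, sorted by key
```

An empty prefix matches every key, which corresponds to the case where the user presses Tab before typing any part of the parser key.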

Tip

For this reason, it is recommended that your parser factory return a ParserInfo object quickly. Do not write any initialization code in your parser factory. Any initialization code can be placed in the create_parser method as that will only be called after the user has run the parse command.

Example

Example custom parser
import datetime
import json
from collections.abc import Iterable
from pathlib import Path
from types import SimpleNamespace

# The NiveshPy names used below (AccountRead, AccountWrite, SecurityWrite,
# SecurityType, SecurityCategory, TransactionType, TransactionWrite, ParserInfo)
# must also be imported; their import paths are not shown in this example.


class SampleParser:
    def __init__(self, file_path: str):
        with open(file_path) as f:
            # Decode JSON objects into namespaces so fields can be read as attributes.
            self.data = json.load(f, object_hook=lambda d: SimpleNamespace(**d))

    def get_date_range(self) -> tuple[datetime.date, datetime.date]:
        # Assumes the file stores dates as ISO 8601 strings (YYYY-MM-DD).
        start = datetime.date.fromisoformat(self.data.start_date)
        end = datetime.date.fromisoformat(self.data.end_date)
        return start, end  # (1)

    def get_accounts(self) -> list[AccountWrite]:
        return [
            AccountWrite(acc.name, acc.org, {"source": "sample"})  # (2)
            for acc in self.data.accounts
        ]

    def get_securities(self) -> Iterable[SecurityWrite]:
        for acc in self.data.accounts:
            for sec in acc.securities:
                yield SecurityWrite(
                    sec.key,
                    sec.name,
                    SecurityType.OTHER,
                    SecurityCategory.OTHER,
                    metadata={"source": "sample", "isin": sec.isin},  # (3)
                )

    def get_transactions(
        self, accounts: Iterable[AccountRead]
    ) -> Iterable[TransactionWrite]:
        # Look up the stored account ID by (name, institution).
        accounts_map = {(acc.name, acc.institution): acc.id for acc in accounts}
        for acc in self.data.accounts:
            account_id = accounts_map.get((acc.name, acc.org))
            for sec in acc.securities:
                for transaction in sec.transactions:
                    txn_type = TransactionType(transaction.type.lower())
                    txn = TransactionWrite(
                        # Same ISO 8601 assumption as in get_date_range.
                        transaction_date=datetime.date.fromisoformat(transaction.date),
                        type=txn_type,
                        description=transaction.description,
                        amount=transaction.amount,
                        units=transaction.units,
                        security_key=sec.key,
                        account_id=account_id,
                        metadata={"source": "sample"},
                    )
                    yield txn


class SampleParserFactory:
    @classmethod
    def get_parser_info(cls) -> ParserInfo:
        return ParserInfo(
            name="Sample Parser",
            description="Sample parser.",
            file_extensions=[".json"],
            password_required=False,  # (4)
        )

    @classmethod
    def create_parser(
        cls, file_path: Path, password: str | None = None
    ) -> SampleParser:
        return SampleParser(file_path.as_posix())

    @classmethod
    def can_parse(cls, file_path: Path) -> bool:
        return file_path.match("*.json")
  1. Transactions for all provided security-account combinations in the given date-range will be overwritten. This is to ensure transactions remain accurate and current. Keep this in mind when returning these dates.
  2. It's a good practice to include a source key in the metadata dictionary for any object you are returning.
  3. You can also include other key-value pairs in the metadata dictionary that may be relevant.
  4. If password_required is True, the user will be prompted for a password.

The above example is for illustrative purposes only. The code above may need to be modified to work.
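The overwrite behaviour mentioned in note 1 is worth picturing before choosing the dates your parser returns. The sketch below models it with plain tuples. It assumes inclusive date bounds and a simple replace-within-range rule; the function overwrite_in_range is hypothetical and is not NiveshPy's service code.

```python
import datetime as dt

# Each transaction is a simplified (account_id, security_key, date, amount) tuple.
Txn = tuple[int, str, dt.date, float]


def overwrite_in_range(
    existing: list[Txn], incoming: list[Txn], start: dt.date, end: dt.date
) -> list[Txn]:
    """Replace stored transactions for the incoming account-security pairs in [start, end]."""
    pairs = {(acc, sec) for acc, sec, _, _ in incoming}
    kept = [
        t for t in existing
        if (t[0], t[1]) not in pairs or not (start <= t[2] <= end)
    ]
    return sorted(kept + incoming, key=lambda t: t[2])


stored = [
    (1, "FUND_A", dt.date(2024, 1, 5), 100.0),   # same pair, in range: replaced
    (1, "FUND_A", dt.date(2023, 12, 1), 50.0),   # same pair, before range: kept
    (2, "FUND_B", dt.date(2024, 1, 10), 75.0),   # different pair: kept
]
parsed = [(1, "FUND_A", dt.date(2024, 1, 6), 120.0)]
result = overwrite_in_range(stored, parsed, dt.date(2024, 1, 1), dt.date(2024, 1, 31))
```

Under these assumptions, reporting a date range wider than the data the file actually covers would drop stored transactions that the file never contained, so the returned range should match the statement period.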

API Reference

ParserInfo(name, description, file_extensions, password_required=False) dataclass

Model for parser metadata.

description instance-attribute

Brief description of what the parser does.

file_extensions instance-attribute

List of supported file extensions for the parser (e.g. ['.csv', '.json']).

name instance-attribute

Human-readable name of the parser.

password_required = False class-attribute instance-attribute

Indicates if the parser requires a password to parse files.

Parser

Bases: Protocol

Protocol for parser classes.

get_accounts()

Get the list of accounts from the parser.

Returns:

Type | Description
Iterable[AccountWrite] | An iterable of AccountWrite objects representing the accounts found in the data.

Source code in niveshpy/models/parser.py
def get_accounts(self) -> Iterable[AccountWrite]:
    """Get the list of accounts from the parser.

    Returns:
        An iterable of AccountWrite objects representing the accounts found in the data.
    """
    ...

get_date_range()

Get the date range of the parsed data.

Returns:

Type | Description
tuple[date, date] | A tuple containing the start and end dates of the data.

Source code in niveshpy/models/parser.py
def get_date_range(self) -> tuple[datetime.date, datetime.date]:
    """Get the date range of the parsed data.

    Returns:
        A tuple containing the start and end dates of the data.
    """
    ...

get_securities()

Get the list of securities from the parser.

Returns:

Type | Description
Iterable[SecurityWrite] | An iterable of SecurityWrite objects representing the securities found in the data.

Source code in niveshpy/models/parser.py
def get_securities(self) -> Iterable[SecurityWrite]:
    """Get the list of securities from the parser.

    Returns:
        An iterable of SecurityWrite objects representing the securities found in the data.
    """
    ...

get_transactions(accounts)

Get the list of transactions from the parser.

The returned transactions should reference the provided accounts and the securities created earlier.

Ensure all valid transactions in the (earlier provided) date-range are included. The service will overwrite all transactions for the referenced account-security pairs.

Parameters:

Name | Type | Description | Default
accounts | Iterable[AccountRead] | An iterable of AccountRead objects representing the accounts to reference. | required

Returns:

Type | Description
Iterable[TransactionWrite] | An iterable of TransactionWrite objects representing the transactions found in the data.

Source code in niveshpy/models/parser.py
def get_transactions(
    self, accounts: Iterable[AccountRead]
) -> Iterable[TransactionWrite]:
    """Get the list of transactions from the parser.

    The returned transactions should reference the provided accounts and the securities created earlier.

    Ensure all valid transactions in the (earlier provided) date-range are included.
    The service will overwrite all transactions for the referenced account-security pairs.

    Args:
        accounts: An iterable of AccountRead objects representing the accounts to reference.

    Returns:
        An iterable of TransactionWrite objects representing the transactions found in the data.
    """
    ...

ParserFactory

Bases: Protocol

Protocol for parser factory classes.

can_parse(file_path) classmethod

Check if the parser can handle the given file based on its extension.

Parameters:

Name | Type | Description | Default
file_path | Path | The path to the file to check. | required

Returns:

Type | Description
bool | True if the parser can handle the file, False otherwise.

Source code in niveshpy/models/parser.py
@classmethod
def can_parse(cls, file_path: Path) -> bool:
    """Check if the parser can handle the given file based on its extension.

    Args:
        file_path: The path to the file to check.

    Returns:
        True if the parser can handle the file, False otherwise.
    """
    ...
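
One way to implement an extension-based check like the one can_parse describes is to compare the file's suffix against the list of supported extensions. This is a sketch only: has_supported_extension and SUPPORTED_EXTENSIONS are hypothetical names, and the case-insensitive comparison is a design choice rather than documented NiveshPy behaviour.

```python
from pathlib import Path

# Same shape as ParserInfo.file_extensions, e.g. ['.csv', '.json'].
SUPPORTED_EXTENSIONS = [".csv", ".json"]


def has_supported_extension(
    file_path: Path, extensions: list[str] = SUPPORTED_EXTENSIONS
) -> bool:
    """Return True if the file's final suffix is in `extensions` (case-insensitive)."""
    return file_path.suffix.lower() in {ext.lower() for ext in extensions}


print(has_supported_extension(Path("statement.JSON")))    # True
print(has_supported_extension(Path("statement.pdf")))     # False
print(has_supported_extension(Path("archive.json.bak")))  # False: final suffix is .bak
```

A can_parse classmethod could delegate to such a helper, so that the extensions advertised in ParserInfo and the ones actually accepted stay in sync.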

create_parser(file_path, password=None, **kwargs) classmethod

Create a parser instance for the given file.

Returns:

Type | Description
Parser | An instance of a Parser that can handle the given file.

Source code in niveshpy/models/parser.py
@classmethod
def create_parser(
    self, file_path: Path, password: str | None = None, **kwargs
) -> Parser:
    """Create a parser instance for the given file.

    Returns:
        An instance of a Parser that can handle the given file.
    """
    ...

get_parser_info() classmethod

Get metadata about the parser.

Returns:

Type | Description
ParserInfo | A ParserInfo object containing metadata about the parser.

Source code in niveshpy/models/parser.py
@classmethod
def get_parser_info(cls) -> ParserInfo:
    """Get metadata about the parser.

    Returns:
        A ParserInfo object containing metadata about the parser.
    """
    ...