Parsers¶
A parser is used to parse financial documents such as account statements and transaction histories.
Bundled Parsers¶
NiveshPy comes bundled with a few useful parsers.
| Parser Name | Description | Command |
|---|---|---|
| CAS Parser | Parser for CAMS and Kfintech Consolidated Account Statements (CAS). | niveshpy parse cas ... |
Custom Parsers¶
You can create your own custom parser for NiveshPy with Python.
In your pyproject.toml file, include an entry point with the group niveshpy.parsers.
[project.entry-points."niveshpy.parsers"] # (1)
my_parser = "my_plugin:MyPluginFactory" # (2)
- This needs to be added as-is to ensure NiveshPy can find your parser.
- Replace my_parser with a unique key, and replace the string with a reference to your PluginFactory class.
Create a class my_plugin.MyPluginFactory that follows the protocol ParserFactory.
This factory will be responsible for actually creating the parser object with the input file and password.
The create_parser method must return an instance of a class that follows the protocol Parser.
The get_parser_info method must return an object of type ParserInfo.
Usage¶
To use a custom parser, install it in the same virtual environment as NiveshPy and run the command:
niveshpy parse my_parser
NiveshPy will use the name defined in your pyproject.toml as a unique key (in this example, my_parser).
Warning
If another installed parser has the same key, your parser may be overwritten. If this happens, a warning will be logged. If you find yourself in this situation, you can change your parser key or advise the user to uninstall the other parser.
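The collision behaviour described in this warning can be pictured with a small registry sketch. This is not NiveshPy's implementation: the function `build_registry`, the logger name, and the rule that the later entry wins are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger("parser_registry_sketch")


def build_registry(entries: list[tuple[str, str]]) -> dict[str, str]:
    """Build a key -> target mapping, warning whenever a key is seen twice."""
    registry: dict[str, str] = {}
    for key, target in entries:
        if key in registry:
            logger.warning(
                "Parser key %r from %s replaces %s", key, target, registry[key]
            )
        registry[key] = target  # assumption: the later entry wins
    return registry


registry = build_registry(
    [("my_parser", "my_plugin:MyPluginFactory"), ("my_parser", "other:OtherFactory")]
)
```

Choosing a key that includes your package name makes such collisions unlikely.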
Shell Completion¶
NiveshPy CLI supports shell completion. If the user types the partial command niveshpy parse ... and presses Tab, the CLI will look for parsers with keys starting with the partial key entered by the user (or all parsers if no key is provided).
Depending on the terminal, the CLI will show a list of all matching keys along with the name from your ParserInfo.name.
Tip
For this reason, it is recommended that your parser factory return a ParserInfo object quickly. Do not write any initialization code in your parser factory. Any initialization code can be placed in the create_parser method as that will only be called after the user has run the parse command.
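A minimal sketch of the prefix matching described above, and of why get_parser_info should stay cheap, might look like this. The names `complete_parser_keys`, `Info`, `CasFactory` and `CsvFactory` are hypothetical, and the real CLI may order or format its suggestions differently.

```python
from typing import NamedTuple


class Info(NamedTuple):  # hypothetical stand-in carrying only ParserInfo.name
    name: str


class CasFactory:
    @classmethod
    def get_parser_info(cls) -> Info:
        return Info("CAS Parser")  # cheap: no file access, no heavy imports


class CsvFactory:
    @classmethod
    def get_parser_info(cls) -> Info:
        return Info("CSV Parser")


def complete_parser_keys(factories: dict[str, type], incomplete: str) -> list[tuple[str, str]]:
    """Return (key, display name) pairs for keys starting with `incomplete`."""
    # get_parser_info runs for every matching key on every Tab press,
    # so any slow work inside it delays the completion list.
    return [
        (key, factory.get_parser_info().name)
        for key, factory in sorted(factories.items())
        if key.startswith(incomplete)
    ]


factories = {"cas": CasFactory, "csv": CsvFactory}
print(complete_parser_keys(factories, "ca"))
```

An empty `incomplete` string matches every key, which corresponds to the case where the user presses Tab before typing any part of the key.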
Example¶
Example custom parser
import datetime
import json
from collections.abc import Iterable
from pathlib import Path
from types import SimpleNamespace

# ParserInfo, AccountCreate, AccountPublic, SecurityCreate, SecurityCategory,
# SecurityType, TransactionCreate and TransactionType come from NiveshPy;
# import them from the modules that define them (paths are not shown on this page).


class SampleParser:
    def __init__(self, file_path: str):
        with open(file_path) as f:
            # Load JSON objects as namespaces so fields can be read as attributes.
            self.data = json.load(f, object_hook=lambda d: SimpleNamespace(**d))

    def get_date_range(self) -> tuple[datetime.date, datetime.date]:
        # Assumes the file stores dates as ISO 8601 strings (YYYY-MM-DD).
        start = datetime.date.fromisoformat(self.data.start_date)
        end = datetime.date.fromisoformat(self.data.end_date)
        return start, end  # (1)

    def get_accounts(self) -> list[AccountCreate]:
        return [
            AccountCreate(name=acc.name, institution=acc.org, properties={"source": "sample"})  # (2)
            for acc in self.data.accounts
        ]

    def get_securities(self) -> Iterable[SecurityCreate]:
        for acc in self.data.accounts:
            for sec in acc.securities:
                yield SecurityCreate(
                    key=sec.key,
                    name=sec.name,
                    type=SecurityType.OTHER,
                    category=SecurityCategory.OTHER,
                    properties={"source": "sample", "isin": sec.isin},  # (3)
                )

    def get_transactions(
        self, accounts: Iterable[AccountPublic]
    ) -> Iterable[TransactionCreate]:
        # Look up the stored account ID for each (name, institution) pair.
        accounts_map = {(acc.name, acc.institution): acc.id for acc in accounts}
        for acc in self.data.accounts:
            account_id = accounts_map.get((acc.name, acc.org))
            for sec in acc.securities:
                for transaction in sec.transactions:
                    txn_type = TransactionType(transaction.type.lower())
                    txn = TransactionCreate(
                        transaction_date=transaction.date,
                        type=txn_type,
                        description=transaction.description,
                        amount=transaction.amount,
                        units=transaction.units,
                        security_key=sec.key,
                        account_id=account_id,
                        properties={"source": "sample"},
                    )
                    yield txn


class SampleParserFactory:
    @classmethod
    def get_parser_info(cls) -> ParserInfo:
        # Keep this cheap: it is also called during shell completion.
        return ParserInfo(
            name="Sample Parser",
            description="Sample parser.",
            file_extensions=[".json"],
            password_required=False,  # (4)
        )

    @classmethod
    def create_parser(
        cls, file_path: Path, password: str | None = None, **kwargs
    ) -> SampleParser:
        # **kwargs matches the ParserFactory protocol; this parser needs no password.
        return SampleParser(file_path.as_posix())
- Transactions for all provided security-account combinations in the given date-range will be overwritten. This is to ensure transactions remain accurate and current. Keep this in mind when returning these dates.
- It's a good practice to include a source key in the metadata dictionary for any object you are returning.
- You can also include other key-value pairs in the metadata dictionary that may be relevant.
- If password_required is True, the user will be prompted for a password.
The above example is for illustrative purposes only. The code above may need to be modified to work.
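The example reads fields such as data.start_date, acc.org and sec.transactions, which implies an input file shaped roughly like the one sketched below. Every value here is made up, including the transaction type string, which would have to match a real TransactionType member once lowercased. Since plain `json.loads` returns dictionaries, the sketch uses an `object_hook` with `SimpleNamespace` to get attribute-style access; that is one possible approach, not necessarily what the author intended.

```python
import json
from types import SimpleNamespace

# Hypothetical file contents matching the attributes the example reads.
sample = {
    "start_date": "2024-01-01",
    "end_date": "2024-03-31",
    "accounts": [
        {
            "name": "Demo Account",
            "org": "Demo Org",
            "securities": [
                {
                    "key": "DEMO1",
                    "name": "Demo Fund",
                    "isin": "XX0000000000",
                    "transactions": [
                        {
                            "date": "2024-01-15",
                            "type": "PURCHASE",  # placeholder type value
                            "description": "Demo purchase",
                            "amount": "1000.00",
                            "units": "10.000",
                        }
                    ],
                }
            ],
        }
    ],
}

# object_hook converts every JSON object into a namespace, so nested
# fields can be read with dots instead of dictionary lookups.
data = json.loads(json.dumps(sample), object_hook=lambda d: SimpleNamespace(**d))
first_txn = data.accounts[0].securities[0].transactions[0]
print(data.start_date, first_txn.type.lower())
```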
API Reference¶
ParserInfo (dataclass)¶
ParserInfo(name: str, description: str, file_extensions: list[str], password_required: bool = False)
Model for parser metadata.
Parser¶
Bases: Protocol
Protocol for parser classes.
get_accounts¶
get_accounts() -> Iterable[AccountCreate]
Get the list of accounts from the parser.
Returns:
| Type | Description |
|---|---|
| Iterable[AccountCreate] | An iterable of AccountCreate objects representing the accounts found in the data. |
Source code in niveshpy/models/parser.py
get_date_range¶
get_date_range() -> tuple[date, date]
Get the date range of the parsed data.
Returns:
| Type | Description |
|---|---|
| tuple[date, date] | A tuple containing the start and end dates of the data. |
Source code in niveshpy/models/parser.py
get_securities¶
get_securities() -> Iterable[SecurityCreate]
Get the list of securities from the parser.
Returns:
| Type | Description |
|---|---|
| Iterable[SecurityCreate] | An iterable of SecurityCreate objects representing the securities found in the data. |
Source code in niveshpy/models/parser.py
get_transactions¶
get_transactions(accounts: Iterable[AccountPublic]) -> Iterable[TransactionCreate]
Get the list of transactions from the parser.
The returned transactions should reference the provided accounts and the securities created earlier.
Ensure all valid transactions in the (earlier provided) date-range are included. The service will overwrite all transactions for the referenced account-security pairs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| accounts | Iterable[AccountPublic] | An iterable of AccountPublic objects representing the accounts to reference. | required |
Returns:
| Type | Description |
|---|---|
| Iterable[TransactionCreate] | An iterable of TransactionCreate objects representing the transactions found in the data. |
Source code in niveshpy/models/parser.py
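To see why get_transactions must return every valid transaction in the date range, it can help to model the overwrite step. The sketch below is only an interpretation of the description above: `Txn` and `overwrite` are hypothetical names, and treating the date range as inclusive at both ends is an assumption.

```python
import datetime
from typing import NamedTuple


class Txn(NamedTuple):  # hypothetical minimal record, not NiveshPy's model
    account_id: int
    security_key: str
    date: datetime.date
    amount: float


def overwrite(
    existing: list[Txn], incoming: list[Txn], start: datetime.date, end: datetime.date
) -> list[Txn]:
    """Replace stored transactions for the referenced pairs within [start, end]."""
    pairs = {(t.account_id, t.security_key) for t in incoming}
    kept = [
        t
        for t in existing
        if not ((t.account_id, t.security_key) in pairs and start <= t.date <= end)
    ]
    return kept + incoming


d = datetime.date
existing = [
    Txn(1, "DEMO1", d(2024, 1, 10), 500.0),   # same pair, in range: replaced
    Txn(1, "DEMO1", d(2023, 12, 1), 200.0),   # same pair, out of range: kept
    Txn(2, "OTHER", d(2024, 1, 20), 300.0),   # pair not referenced: kept
]
incoming = [Txn(1, "DEMO1", d(2024, 1, 15), 1000.0)]
result = overwrite(existing, incoming, d(2024, 1, 1), d(2024, 3, 31))
```

Under this model, a parser that omitted a valid transaction inside the range would cause the stored copy of that transaction to disappear.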
ParserFactory¶
Bases: Protocol
Protocol for parser factory classes.
create_parser (classmethod)¶
create_parser(file_path: Path, password: str | None = None, **kwargs) -> Parser
Create a parser instance for the given file.
Returns:
| Type | Description |
|---|---|
| Parser | An instance of a Parser that can handle the given file. |
Source code in niveshpy/models/parser.py
get_parser_info (classmethod)¶
get_parser_info() -> ParserInfo
Get metadata about the parser.
Returns:
| Type | Description |
|---|---|
| ParserInfo | A ParserInfo object containing metadata about the parser. |
Source code in niveshpy/models/parser.py