Parsers¶
A parser reads financial documents, such as account statements and transaction histories, and extracts accounts, securities, and transactions from them.
Bundled Parsers¶
NiveshPy comes bundled with a few useful parsers.
| Parser Name | Description | Command |
|---|---|---|
| CAS Parser | Parser for CAMS and Kfintech Consolidated Account Statements (CAS). | niveshpy parse cas ... |
Custom Parsers¶
You can create your own custom parser for NiveshPy with Python.
In your pyproject.toml file, include an entry point with the group niveshpy.parsers.
```toml
[project.entry-points."niveshpy.parsers"] # (1)
my_parser = "my_plugin:MyPluginFactory" # (2)
```
- (1) This needs to be added as-is to ensure NiveshPy can find your parser.
- (2) Replace my_parser with a unique key, and replace the string with a reference to your PluginFactory class.
Create a class my_plugin.MyPluginFactory that follows the protocol ParserFactory.
This factory will be responsible for actually creating the parser object with the input file and password.
The create_parser method must return an instance of a class that follows the protocol Parser.
The get_parser_info method must return an object of type ParserInfo.
Usage¶
To use a custom parser, install it in the same virtual environment as NiveshPy and run the command:
niveshpy parse my_parser
NiveshPy will use the name defined in your pyproject.toml as a unique key (in this example, my_parser).
Warning
If another installed parser has the same key, your parser may be overwritten. If this happens, a warning will be logged. If you find yourself in this situation, you can change your parser key or advise the user to uninstall the other parser.
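If you want to look for key collisions before publishing, a sketch like the following groups installed entry points by key. It is only an illustration built on the standard library (Python 3.10 or later assumed). `find_duplicate_keys` is a hypothetical helper, and it says nothing about which parser NiveshPy keeps when a collision occurs.

```python
from collections import defaultdict
from importlib.metadata import entry_points


def find_duplicate_keys(pairs: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Return keys that are claimed by more than one distinct target."""
    targets: dict[str, list[str]] = defaultdict(list)
    for key, target in pairs:
        if target not in targets[key]:
            targets[key].append(target)
    return {key: found for key, found in targets.items() if len(found) > 1}


if __name__ == "__main__":
    installed = [
        (ep.name, ep.value) for ep in entry_points(group="niveshpy.parsers")
    ]
    for key, found in find_duplicate_keys(installed).items():
        print(f"Key {key!r} is claimed by: {', '.join(found)}")
```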
Shell Completion¶
NiveshPy CLI supports shell completion. If the user types the partial command niveshpy parse ... and presses Tab, the CLI will look for parsers with keys starting with the partial key entered by the user (or all parsers if no key is provided).
Depending on the terminal, the CLI will show a list of all matching keys along with the name defined in ParserInfo.name.
Tip
For this reason, it is recommended that your parser factory return a ParserInfo object quickly. Do not write any initialization code in your parser factory. Any initialization code can be placed in the create_parser method as that will only be called after the user has run the parse command.
Example¶
Example custom parser
```python
import datetime
import json
from collections.abc import Iterable
from pathlib import Path
from types import SimpleNamespace

# AccountRead, AccountWrite, SecurityWrite, SecurityType, SecurityCategory,
# TransactionWrite, TransactionType and ParserInfo are NiveshPy models; import
# them from the NiveshPy package (their module paths are not listed here).


class SampleParser:
    def __init__(self, file_path: str):
        with open(file_path) as f:
            # Load JSON objects as namespaces so fields can be read as attributes.
            self.data = json.load(f, object_hook=lambda d: SimpleNamespace(**d))

    def get_date_range(self) -> tuple[datetime.date, datetime.date]:
        # Assumes the file stores dates as ISO 8601 strings (YYYY-MM-DD).
        return (  # (1)
            datetime.date.fromisoformat(self.data.start_date),
            datetime.date.fromisoformat(self.data.end_date),
        )

    def get_accounts(self) -> list[AccountWrite]:
        return [
            AccountWrite(acc.name, acc.org, {"source": "sample"})  # (2)
            for acc in self.data.accounts
        ]

    def get_securities(self) -> Iterable[SecurityWrite]:
        for acc in self.data.accounts:
            for sec in acc.securities:
                yield SecurityWrite(
                    sec.key,
                    sec.name,
                    SecurityType.OTHER,
                    SecurityCategory.OTHER,
                    metadata={"source": "sample", "isin": sec.isin},  # (3)
                )

    def get_transactions(
        self, accounts: Iterable[AccountRead]
    ) -> Iterable[TransactionWrite]:
        # Map (name, institution) to the stored account ID for quick lookup.
        accounts_map = {(acc.name, acc.institution): acc.id for acc in accounts}
        for acc in self.data.accounts:
            account_id = accounts_map.get((acc.name, acc.org))
            for sec in acc.securities:
                for transaction in sec.transactions:
                    txn_type = TransactionType(transaction.type.lower())
                    txn = TransactionWrite(
                        # Same ISO 8601 assumption as in get_date_range.
                        transaction_date=datetime.date.fromisoformat(transaction.date),
                        type=txn_type,
                        description=transaction.description,
                        amount=transaction.amount,
                        units=transaction.units,
                        security_key=sec.key,
                        account_id=account_id,
                        metadata={"source": "sample"},
                    )
                    yield txn


class SampleParserFactory:
    @classmethod
    def get_parser_info(cls) -> ParserInfo:
        return ParserInfo(
            name="Sample Parser",
            description="Sample parser.",
            file_extensions=[".json"],
            password_required=False,  # (4)
        )

    @classmethod
    def create_parser(
        cls, file_path: Path, password: str | None = None
    ) -> SampleParser:
        return SampleParser(file_path.as_posix())

    @classmethod
    def can_parse(cls, file_path: Path) -> bool:
        return file_path.match("*.json")
```
- (1) Transactions for all provided security-account combinations in the given date-range will be overwritten. This is to ensure transactions remain accurate and current. Keep this in mind when returning these dates.
- (2) It's a good practice to include a source key in the metadata dictionary for any object you are returning.
- (3) You can also include other key-value pairs in the metadata dictionary that may be relevant.
- (4) If password_required is True, the user will be prompted for a password.
The above example is for illustrative purposes only. The code above may need to be modified to work.
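The example reads its input through attribute access (data.start_date, acc.securities, and so on). One input shape that would support this is sketched below. The field names are inferred from the example code, the values are invented placeholders, and loading with `SimpleNamespace` plus ISO-formatted date strings is an assumption rather than something NiveshPy requires. The transaction type string is also a placeholder; valid values depend on TransactionType.

```python
import datetime
import json
from types import SimpleNamespace

# Hypothetical input document; all values are placeholders, not real data.
SAMPLE_JSON = """
{
  "start_date": "2024-01-01",
  "end_date": "2024-03-31",
  "accounts": [
    {
      "name": "Example Account",
      "org": "Example Org",
      "securities": [
        {
          "key": "SEC1",
          "name": "Example Fund",
          "isin": "XX0000000000",
          "transactions": [
            {"date": "2024-02-15", "type": "PURCHASE", "description": "Buy",
             "amount": "1000.00", "units": "10.000"}
          ]
        }
      ]
    }
  ]
}
"""

# object_hook turns every JSON object into a namespace, enabling attribute access.
data = json.loads(SAMPLE_JSON, object_hook=lambda d: SimpleNamespace(**d))

start = datetime.date.fromisoformat(data.start_date)
first_txn = data.accounts[0].securities[0].transactions[0]
print(start, data.accounts[0].org, first_txn.type.lower())
```

Plain `json.loads` without the hook returns dictionaries, in which case the fields would be read with `data["start_date"]` style lookups instead.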
API Reference¶
ParserInfo(name, description, file_extensions, password_required=False) dataclass¶
Model for parser metadata.
description instance-attribute¶
Brief description of what the parser does.
file_extensions instance-attribute¶
List of supported file extensions for the parser (e.g. ['.csv', '.json']).
name instance-attribute¶
Human-readable name of the parser.
password_required = False class-attribute instance-attribute¶
Indicates if the parser requires a password to parse files.
Parser¶
Bases: Protocol
Protocol for parser classes.
get_accounts()¶
Get the list of accounts from the parser.
Returns:
| Type | Description |
|---|---|
| Iterable[AccountWrite] | An iterable of AccountWrite objects representing the accounts found in the data. |
Source code in niveshpy/models/parser.py
get_date_range()¶
Get the date range of the parsed data.
Returns:
| Type | Description |
|---|---|
| tuple[date, date] | A tuple containing the start and end dates of the data. |
Source code in niveshpy/models/parser.py
get_securities()¶
Get the list of securities from the parser.
Returns:
| Type | Description |
|---|---|
| Iterable[SecurityWrite] | An iterable of SecurityWrite objects representing the securities found in the data. |
Source code in niveshpy/models/parser.py
get_transactions(accounts)¶
Get the list of transactions from the parser.
The returned transactions should reference the provided accounts and the securities created earlier.
Ensure all valid transactions in the (earlier provided) date-range are included. The service will overwrite all transactions for the referenced account-security pairs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| accounts | Iterable[AccountRead] | An iterable of AccountRead objects representing the accounts to reference. | required |
Returns:
| Type | Description |
|---|---|
| Iterable[TransactionWrite] | An iterable of TransactionWrite objects representing the transactions found in the data. |
Source code in niveshpy/models/parser.py
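The overwrite behaviour described for get_transactions can be pictured with a small in-memory sketch. This is not NiveshPy's implementation: `Txn` and `overwrite_transactions` are hypothetical, and treating the date range as inclusive is an assumption. It shows why a parser should return every valid transaction in the range it reports, since anything it leaves out for a referenced account-security pair is dropped.

```python
import datetime
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:  # hypothetical, simplified stand-in for a stored transaction
    date: datetime.date
    account_id: int
    security_key: str
    amount: str


def overwrite_transactions(
    existing: list[Txn],
    incoming: list[Txn],
    date_range: tuple[datetime.date, datetime.date],
) -> list[Txn]:
    """Replace stored rows for the incoming account-security pairs within the range."""
    start, end = date_range
    pairs = {(t.account_id, t.security_key) for t in incoming}
    kept = [
        t
        for t in existing
        # Keep rows outside the range, or for pairs the parser did not return.
        if not (start <= t.date <= end and (t.account_id, t.security_key) in pairs)
    ]
    return kept + incoming


d = datetime.date
stored = [
    Txn(d(2024, 1, 5), 1, "SEC1", "100"),  # in range, same pair: replaced
    Txn(d(2023, 12, 1), 1, "SEC1", "50"),  # before the range: kept
    Txn(d(2024, 1, 9), 2, "SEC9", "75"),  # different pair: kept
]
parsed = [Txn(d(2024, 1, 20), 1, "SEC1", "200")]
result = overwrite_transactions(stored, parsed, (d(2024, 1, 1), d(2024, 1, 31)))
print(len(result))
```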
ParserFactory¶
Bases: Protocol
Protocol for parser factory classes.
can_parse(file_path) classmethod¶
Check if the parser can handle the given file based on its extension.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file_path | Path | The path to the file to check. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if the parser can handle the file, False otherwise. |
Source code in niveshpy/models/parser.py
create_parser(file_path, password=None, **kwargs) classmethod¶
Create a parser instance for the given file.
Returns:
| Type | Description |
|---|---|
| Parser | An instance of a Parser that can handle the given file. |
Source code in niveshpy/models/parser.py
get_parser_info() classmethod¶
Get metadata about the parser.
Returns:
| Type | Description |
|---|---|
| ParserInfo | A ParserInfo object containing metadata about the parser. |
Source code in niveshpy/models/parser.py