ci(pypi): Add integration with Github
Elijas committed Sep 27, 2023
1 parent a25a3d5 commit 1967228
Showing 5 changed files with 113 additions and 1 deletion.
1 change: 1 addition & 0 deletions .github/workflows/release.yml
@@ -34,5 +34,6 @@ jobs:
NEW_VERSION="${POETRY_VERSION}.post${{ github.run_number }}"
poetry version "${NEW_VERSION}"
fi
poetry run task generate-pypi-readme
poetry build
poetry publish --username __token__ --password ${{ secrets.PUBLIC_PYPI_API_TOKEN }}
4 changes: 3 additions & 1 deletion README.md
@@ -28,7 +28,10 @@
<a href="http://hits.dwyl.com/alphanome-ai/sec-parser"><img src="https://img.shields.io/endpoint?url=https%3A%2F%2Fhits.dwyl.com%2Falphanome-ai%2Fsec-parser.json%3Fshow%3Dunique" alt="HitCount" /></a>
<a href="https://twitter.com/alphanomeai"><img alt="X (formerly Twitter) Follow" src="https://img.shields.io/twitter/follow/alphanomeai"></a>
<a href="https://github.com/JonSnow/MyBadges"><img src="https://img.shields.io/github/stars/alphanome-ai/sec-parser.svg?style=social&label=Star" alt="GitHub stars"></a>


</p>

<div align="left">
Parse SEC EDGAR HTML documents into a tree of elements that correspond to the visual structure of the document.
</div>
@@ -43,7 +46,6 @@

## Overview


The `sec-parser` project simplifies the process of extracting meaningful information from SEC EDGAR HTML documents. It organizes the document's source code into a list or tree of elements that correspond to the visual structure of the document. This includes distinct elements for section titles, paragraphs, and tables, making the data easier to analyze and understand.

This tool is especially beneficial for Artificial Intelligence (AI) and Large Language Model (LLM) applications. It significantly improves the efficiency of data extraction and analysis in these fields.
24 changes: 24 additions & 0 deletions Taskfile.yml
@@ -84,3 +84,27 @@ tasks:
cmds:
# PYTHONPATH is added to make streamlit watch file changes. Read more: https://docs.streamlit.io/knowledge-base/using-streamlit/streamlit-watch-changes-other-modules-importing-app
- export PYTHONPATH=$PYTHONPATH:$(pwd)/sec_parser; poetry run streamlit run debug_tools/parser_output_visualizer/app.py --server.runOnSave=true

########################
### CI/CD Automation ###
########################

generate-pypi-readme:
desc: Generate the README file for PyPI.
cmds:
# Copy the main README.md to docs/README-pypi.md for PyPI.
- cp README.md docs/README-pypi.md

# Check for the existence of the image tag in the newly copied README. Exit with error if not found.
# This will make sure the CI/CD fails, drawing attention to the problem.
- grep -q '<img src=\"docs/title.png\" alt=\"sec-parser logo\" height=\"40\"/>' docs/README-pypi.md || { echo "String not found!"; exit 1; }

# Replace the image tag with 'SEC-PARSER' text.
# Output the result to a temporary file, avoiding in-place issues between macOS and Linux.
- sed 's/<img src=\"docs\/title.png\" alt=\"sec-parser logo\" height=\"40\"\/>/SEC-PARSER/g' docs/README-pypi.md > docs/README-pypi.md.tmp

# Replace repeating occurrences of &nbsp; with a single &nbsp;. This command is compatible with both macOS and Linux.
- if [[ "$OSTYPE" == "darwin"* ]]; then sed -E 's/(&nbsp;)+/\1/g' docs/README-pypi.md.tmp > docs/README-pypi.md; else sed -r 's/(&nbsp;)+/\1/g' docs/README-pypi.md.tmp > docs/README-pypi.md; fi

# Remove the temporary file, as it's no longer needed.
- rm docs/README-pypi.md.tmp
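
The `generate-pypi-readme` task above can also be read as a single text transformation: check that the logo tag exists, replace it, then collapse repeated `&nbsp;` entities. Below is a minimal Python sketch of that same logic, not code from this commit. The names `make_pypi_readme` and `write_pypi_readme` are hypothetical, and the tag is matched as a literal string rather than as a `grep`/`sed` pattern.

```python
import re
from pathlib import Path

# The logo tag the Taskfile searches for, written here as a literal string.
LOGO_TAG = '<img src="docs/title.png" alt="sec-parser logo" height="40"/>'


def make_pypi_readme(readme_text: str) -> str:
    """Return README text adapted for PyPI (hypothetical helper, not in the repo)."""
    # Fail loudly if the tag is missing, mirroring the `grep -q ... || exit 1` guard.
    if LOGO_TAG not in readme_text:
        raise ValueError("logo <img> tag not found in README")
    # Replace the image with plain text, as the first sed command does.
    text = readme_text.replace(LOGO_TAG, "SEC-PARSER")
    # Collapse each run of &nbsp; into a single &nbsp;, as the second sed command does.
    return re.sub(r"(?:&nbsp;)+", "&nbsp;", text)


def write_pypi_readme(source: Path, target: Path) -> None:
    """Read `source`, transform it, and write the result to `target`."""
    target.write_text(
        make_pypi_readme(source.read_text(encoding="utf-8")),
        encoding="utf-8",
    )
```

Calling `write_pypi_readme(Path("README.md"), Path("docs/README-pypi.md"))` from the repository root would correspond to the task's `cp` and `sed` steps. Doing the work in one process would avoid the temporary file and the macOS/Linux `sed` branching, at the cost of requiring Python in the CI step.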
84 changes: 84 additions & 0 deletions docs/README-pypi.md
@@ -0,0 +1,84 @@
<p align="center">&nbsp;</p>
<p align="center">
<h1 align="center">SEC-PARSER</h1>
</p>
<p align="left">
<!-- Using &nbsp; for alignment due to GitHub README limitations -->
<b>Essentials ➔&nbsp;</b>
<a href="LICENSE.txt"><img src="https://img.shields.io/github/license/alphanome-ai/sec-parser.svg" alt="Licence"></a>
<a href="https://project-types.github.io/#federation"><img src="https://img.shields.io/badge/project%20type-federation-brightgreen" alt="Project Type: Federation"></a>
<a href="https://github.com/mkenney/software-guides/blob/master/STABILITY-BADGES.md#experimental"><img src="https://img.shields.io/badge/stability-experimental-orange.svg" alt="Experimental"></a>
<br>
<b>Health ➔&nbsp;</b>
<a href="https://github.com/alphanome-ai/sec-parser/actions/workflows/check.yml"><img alt="GitHub Workflow Status (with event)" src="https://img.shields.io/github/actions/workflow/status/alphanome-ai/sec-parser/check.yml"></a>
<a href="https://github.com/alphanome-ai/sec-parser/commits/main"><img alt="Last Commit" src="https://img.shields.io/github/last-commit/alphanome-ai/sec-parser"></a>
<br>
<b>Quality ➔&nbsp;</b>
<a href="https://codecov.io/gh/alphanome-ai/sec-parser"><img src="https://codecov.io/gh/alphanome-ai/sec-parser/graph/badge.svg?token=KJLA96CBCN" alt="codecov" /></a>
<a href="https://mypy-lang.org/"><img src="https://img.shields.io/badge/type%20checked-mypy-blue.svg"></a>
<a href="https://github.com/psf/black"><img alt="Code Style: Black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
<a href="https://github.com/astral-sh/ruff"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json" alt="Ruff"></a>
<br>
<b>Distribution ➔&nbsp;</b>
<a href="https://badge.fury.io/py/sec-parser"><img src="https://badge.fury.io/py/sec-parser.svg" alt="PyPI version" /></a>
<a href="https://pypi.org/project/sec-parser/"><img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/sec-parser"></a>
<a href="https://pypistats.org/packages/sec-parser"><img src="https://img.shields.io/pypi/dm/sec-parser.svg" alt="PyPI downloads"></a>
<br>
<b>Community ➔&nbsp;</b>
<a href="http://hits.dwyl.com/alphanome-ai/sec-parser"><img src="https://img.shields.io/endpoint?url=https%3A%2F%2Fhits.dwyl.com%2Falphanome-ai%2Fsec-parser.json%3Fshow%3Dunique" alt="HitCount" /></a>
<a href="https://twitter.com/alphanomeai"><img alt="X (formerly Twitter) Follow" src="https://img.shields.io/twitter/follow/alphanomeai"></a>
<a href="https://github.com/JonSnow/MyBadges"><img src="https://img.shields.io/github/stars/alphanome-ai/sec-parser.svg?style=social&label=Star" alt="GitHub stars"></a>


</p>

<div align="left">
Parse SEC EDGAR HTML documents into a tree of elements that correspond to the visual structure of the document.
</div>
<div align="center">
<b>
<a href="https://sec-parser-output-visualizer.app.alphanome.dev">Demo</a> |
<a href="https://github.com/alphanome-ai/sec-parser/discussions">Discussions</a> |
<a href="https://github.com/alphanome-ai/sec-parser/issues">Issues</a>
</b>
</div>
<br>

## Overview

The `sec-parser` project simplifies the process of extracting meaningful information from SEC EDGAR HTML documents. It organizes the document's source code into a list or tree of elements that correspond to the visual structure of the document. This includes distinct elements for section titles, paragraphs, and tables, making the data easier to analyze and understand.

This tool is especially beneficial for Artificial Intelligence (AI) and Large Language Model (LLM) applications. It significantly improves the efficiency of data extraction and analysis in these fields.

[**Explore the Demo!**](https://sec-parser-output-visualizer.app.alphanome.dev/)

## Installation

You can install `sec-parser` using pip:

```bash
pip install sec-parser
```

## Usage

```python
import sec_parser as sp

tree = sp.parse_latest("10-K", ticker="AAPL")

# Show the general structure of the tree
print(tree.render())
```
Console output:
```
RootSectionElement: PART I — FINANCIAL INFORMATION
├── TitleElement: Item 1. Financial Statements
│ ├── TitleElement: CONDENSED CONSOLIDATED STATEMENTS OF OPERATIONS (U...
│ │ ├── TextElement: (In millions, except number of shares which are re...
│ │ ├── TableElement: ...
│ ...
```

# License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
1 change: 1 addition & 0 deletions pyproject.toml
@@ -4,6 +4,7 @@ version = "0.8.0"
description = "A robust and efficient parser for SEC filings, designed to extract and analyze financial data with ease."
authors = ["Alphanome.AI <[email protected]>"]
readme = "README.md"
repository = "https://github.com/alphanome-ai/sec-parser"


[tool.poetry.dependencies]