Finance Data and LMs for NLP

by 아르카눔 2025. 1. 15.

A brief overview of the natural-language datasets available for finance.

 

Data

 

Financial Corpus

 

SEC EDGAR

Seeking Alpha

Investext

 

Reuters

Bloomberg

Investopedia

 

General Corpus

Wikipedia

C4

BookCorpus

 

 

 

 

FPB

FiQA-SA

AnalystTone

FiQA-QA

FinSBD19

Headline

FinSBD21

ConvFinQA

FIN

StockNet

CIKM18

BigData22

FOMC

FinQA

ECTSum

FinRED

 

 

 

FinGPT's data, FinNLP: link

 

 

Korean Financial Data

BC Card Finance Kor data (link)

KoFinanceLLM open data (link)

https://huggingface.co/datasets?language=language:ko&sort=trending&search=financ
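The Hugging Face search URL above can also be built programmatically, which is handy when trying other keywords or languages. The following is only a minimal sketch: the helper name `hub_search_url` and its parameters are hypothetical, and the query layout (`language=language:<code>`, `sort`, `search`) is simply copied from the URLs that appear in this post, not taken from any documented API.

```python
from urllib.parse import urlencode

# Base address of the Hugging Face dataset search page, as used in this post.
BASE = "https://huggingface.co/datasets"


def hub_search_url(search, language=None, sort="trending"):
    """Build a dataset search URL in the same shape as the links in this post.

    `hub_search_url` is a hypothetical helper; the query parameters mirror
    the URLs in the post and are an assumption about the site's URL scheme.
    """
    params = {}
    if language is not None:
        # The post's URL encodes the language filter as "language:<code>".
        params["language"] = f"language:{language}"
    params["sort"] = sort
    params["search"] = search
    # Keep ":" unescaped so the result matches the URL written in the post.
    return f"{BASE}?{urlencode(params, safe=':')}"


# Korean finance datasets (partial keyword "financ", as in the link above).
print(hub_search_url("financ", language="ko"))
# All-language search for "finance".
print(hub_search_url("finance"))
```

The partial keyword `financ` is kept as written in the original link; it presumably matches both "finance" and "financial".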

 

 

 

Financial LMs

 

 

 

 

Models

 

FinBERT

FLANG

FinMA

InvestLM

BloombergGPT

FinGPT

 

 

 

References:

A Survey of Large Language Models in Finance (FinLLMs) - Jean Lee, Nicholas Stevens, Soyeon Caren Han, Minseok Song (link)

https://huggingface.co/datasets?sort=trending&search=finance