...

Masahiro Suzuki

Nikko Asset Management Co., Ltd.
Quant Analyst

Ph.D. student in Izumi Lab.,
Department of Systems Innovation,
School of Engineering, The University of Tokyo

Mail : msuzuki [at] g.ecc.u-tokyo.ac.jp
ORCID : 0000-0001-8519-5617
Google Scholar : scholar.google.com/citations?user=_-8tzX0AAAAJ
researchmap : researchmap.jp/masahiro-suzuki
ResearchGate : Masahiro-Suzuki-11
GitHub : github.com/retarfi
LinkedIn : linkedin.com/in/msuzuki7/

Self-introduction

Research Area: Text Mining, Natural Language Processing (Mainly in the financial domain, some in agriculture and medicine)

Member of: IEEE, the Association for Natural Language Processing, the Japanese Society for Artificial Intelligence

Biography

2022/10 - :   Studying at Izumi Lab., Department of Systems Innovation, School of Engineering, The University of Tokyo (Ph.D. Program)

2022/04 - :   Working at Nikko Asset Management Co., Ltd.

2020/04 - 2022/03 :   Studied at Izumi Lab., Department of Systems Innovation, School of Engineering, The University of Tokyo (Master's Program)

2019/05 - 2020/03 :   Researched at Izumi Lab., Faculty of Engineering

2018/04 - 2020/03 :   Studied at Systems Design & Management Course, Department of Systems Innovation, Faculty of Engineering

2016/04 - 2018/03 :   Studied at Natural Sciences I, College of Arts and Sciences (Junior Division)

2015/04 - 2016/03 :   Studied at Department of Industrial and Systems Engineering, Faculty of Science and Engineering, Keio University

2009/04 - 2015/03 :   Senior & Junior High School at Komaba, University of Tsukuba

1996/09 :   Born in Tokyo, Japan

Public resources

  • Economy Watchers Survey Dataset (Hugging Face Datasets)
  • Japanese DeBERTaV2 Model (base / small)
  • Japanese Large Language Model Project
    Japanese datasets and tuning models are available.
    detail (in Japanese)
  • ACL anthology Japanese abstract
    Abstracts of papers in the ACL Anthology, automatically translated into Japanese using ChatGPT.
  • Pre-training Language Models for Japanese (github.com/retarfi/language-pretraining)
    Pre-training of BERT and ELECTRA models using Japanese Wikipedia and financial-domain corpora. Both the Wikipedia and the financial models are available through the Transformers natural language processing library (huggingface.co/izumi-lab).
  • jptranstokenizer: Japanese Tokenizer for transformers (github.com/retarfi/jptranstokenizer)
    Japanese tokenizer compatible with the Hugging Face library. Juman++, Sudachi, and spaCy LUW are available as main-word tokenizers (MeCab is also available). WordPiece and SentencePiece are available as subword tokenizers. You can easily load a trained tokenizer that uses Juman++ and SentencePiece.
    PyPI
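
The two-stage design described for jptranstokenizer (a main-word tokenizer followed by a subword tokenizer) can be illustrated with a toy sketch. This is not the jptranstokenizer API: `whitespace_words`, `wordpiece`, `tokenize`, and `TOY_VOCAB` are hypothetical names invented for illustration, whitespace splitting stands in for a real word segmenter such as Juman++ or Sudachi, and the subword step is a generic greedy longest-match WordPiece.

```python
from typing import Callable, List, Set


def whitespace_words(text: str) -> List[str]:
    """Stand-in main-word tokenizer (assumption): a real pipeline would
    call a morphological analyzer such as Juman++, Sudachi, or MeCab."""
    return text.split()


def wordpiece(word: str, vocab: Set[str], unk: str = "[UNK]") -> List[str]:
    """Split one word into subwords by greedy longest-match-first lookup."""
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation marker
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No vocabulary entry covers this position: the whole word is unknown.
            return [unk]
        pieces.append(piece)
        start = end
    return pieces


def tokenize(
    text: str,
    word_tokenizer: Callable[[str], List[str]],
    vocab: Set[str],
) -> List[str]:
    """Stage 1: split text into words. Stage 2: split each word into subwords."""
    tokens: List[str] = []
    for word in word_tokenizer(text):
        tokens.extend(wordpiece(word, vocab))
    return tokens


# Toy vocabulary, made up for this example only.
TOY_VOCAB = {"金融", "##市場", "の", "分析"}

if __name__ == "__main__":
    print(tokenize("金融市場 の 分析", whitespace_words, TOY_VOCAB))
```

Because the word tokenizer is passed in as a function, swapping the segmenter does not affect the subword step, which mirrors the idea of combining different main-word and subword tokenizers.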

Papers

Publication (Refereed)

  1. FinDeBERTaV2: Word-Segmentation-Free Pre-trained Language Model for Finance (in Japanese)
    Masahiro Suzuki, Hiroki Sakaji, Masanori Hirano, and Kiyoshi Izumi.
    Transactions of the Japanese Society for Artificial Intelligence, 2024.
    J-STAGE / bib
  2. Development and analysis of medical instruction-tuning for Japanese large language models
    Issey Sukeda, Masahiro Suzuki, Hiroki Sakaji, and Satoshi Kodera.
    AIH, 2024.
    AccScience / bib
  3. Constructing and analyzing domain-specific language model for financial text mining
    Masahiro Suzuki, Hiroki Sakaji, Masanori Hirano, and Kiyoshi Izumi.
    Information Processing & Management, 2023.
    Impact Factor: 8.6, Q1 Journal as of 2022
    ScienceDirect / paper / detail
  4. Forecasting Stock Price Trends by Analyzing Economic Reports With Analyst Profiles
    Masahiro Suzuki, Hiroki Sakaji, Kiyoshi Izumi, and Yasushi Ishikawa.
    Frontiers in Artificial Intelligence, 2022.
    Impact Factor (2022): 4.0
    Frontiers / bib
  5. Forecasting Net Income Estimate and Stock Price Using Text Mining from Economic Reports
    Masahiro Suzuki, Hiroki Sakaji, Kiyoshi Izumi, and Yasushi Ishikawa.
    Information, 2020.
    Selected as Cover Story
    MDPI / bib

International Conference (Refereed)

  1. Refined and Segmented Price Sentiment Indices from Survey Comments
    Masahiro Suzuki and Hiroki Sakaji.
    2024 IEEE International Conference on Big Data (Big Data), 2024.
    Accepted
    arXiv / bib
  2. Is ChatGPT the Future of Causal Text Mining? A Comprehensive Evaluation and Analysis
    Takehiro Takayanagi, Masahiro Suzuki, Ryotaro Kobayashi, Hiroki Sakaji, and Kiyoshi Izumi.
    2024 IEEE International Conference on Big Data (Big Data), 2024.
    Accepted
    arXiv / bib
  3. Sentiment-driven Stock Selection in Japan using Language Models
    Masahiro Suzuki and Hiroki Sakaji.
    2024 IEEE Symposium on Computational Intelligence for Financial Engineering and Economics (CIFEr), 2024.
    Accepted
  4. JaFIn: Japanese Financial Instruction Dataset
    Kota Tanabe, Masahiro Suzuki, Hiroki Sakaji, and Itsuki Noda.
    2024 IEEE Symposium on Computational Intelligence for Financial Engineering and Economics (CIFEr), 2024.
    Accepted
    arXiv / bib
  5. JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning
    Issey Sukeda, Masahiro Suzuki, Hiroki Sakaji, and Satoshi Kodera.
    Deep Generative Models for Health Workshop NeurIPS 2023, 2023.
    OpenReview / arXiv / bib
  6. From Base to Conversational: Japanese Instruction Dataset and Tuning Large Language Models
    Masahiro Suzuki, Masanori Hirano, and Hiroki Sakaji.
    2023 IEEE International Conference on Big Data (Big Data), 2023.
    IEEE / arXiv / SSRN / bib
  7. llm-japanese-dataset v0: Construction of Japanese Chat Dataset for Large Language Models and its Methodology
    Masanori Hirano, Masahiro Suzuki, and Hiroki Sakaji.
    The 12th International Workshop on Web Services and Social Media (WSSM-2023) in The 26th International Conference on Network-Based Information Systems (NBiS-2023), 2023.
    Springer Link / arXiv / SSRN / bib
  8. Gradual Further Pre-training Architecture for Economics/Finance Domain Adaptation of Language Model
    Hiroki Sakaji, Masahiro Suzuki, Kiyoshi Izumi, and Hiroyuki Mitsugi.
    2022 IEEE International Conference on Big Data (Big Data), 2022.
    IEEE / paper / detail
  9. Constructing and analyzing domain-specific language model for financial text mining
    Masahiro Suzuki, Hiroki Sakaji, Masanori Hirano, and Kiyoshi Izumi.
    Information Processing and Management Conference, 2022.
  10. Market Trend Analysis Using Polarity Index Generated from Analyst Reports
    Rei Taguchi, Hikaru Watanabe, Masanori Hirano, Masahiro Suzuki, Hiroki Sakaji, Kiyoshi Izumi, and Kenji Hiramatsu.
    2021 IEEE International Conference on Big Data (Big Data), 2021.
    IEEE / bib
  11. Stock Price Analysis Using Combination of Analyst Reports and Several Document
    Masahiro Suzuki, Toshiya Katagi, Hiroki Sakaji, Kiyoshi Izumi, and Yasushi Ishikawa.
    2019 International Conference on Data Mining Workshops (ICDMW), 2019.
    Best Paper Award
    IEEE / paper / detail

Domestic Conference (Non-Refereed) / Other

  1. Development of Dialogue System to Induce Reconciliation of Social Values (in Japanese)
    Ryotaro Kobayashi, Takehiro Takayanagi, Masahiro Suzuki, Yukiko Ogura, and Hiroki Sakaji.
    19th Symposium of Young Researcher Association for NLP Studies (YANS), 2024.
  2. EWS: the Economic Watcher Survey Datasets and Tasks for the Financial and Economic Domain (in Japanese)
    Masahiro Suzuki and Hiroki Sakaji.
    IEICE Tech. Rep., 2024.
    IEICE / Jxiv / bib
  3. JaFIn: Japanese Financial Instruction Dataset (in Japanese)
    Kota Tanabe, Masahiro Suzuki, Hiroki Sakaji, and Itsuki Noda.
    IEICE Tech. Rep., 2024.
    IEICE / paper / detail
  4. Large Language Models in the Financial Domain (in Japanese)
    Kiyoshi Izumi, Yuri Murayama, Masahiro Suzuki, Takehiro Takayanagi, Shota Nakasuji, Ryotaro Kobayashi, and Makishi Yamamoto.
    The Annual Conference of JSAI, 2024.
    confit / bib
  5. Stock Selection Attempt using Sentiment of Japan Company Handbook (in Japanese)
    Masahiro Suzuki.
    Proceedings of the Annual Conference of JSAI, 2024.
    J-STAGE / bib
  6. Impact Analysis on Stock Prices by Information Propagation in Social Networks using Artificial Market Simulations (in Japanese)
    Miyuki Matsumoto, Ryuji Hashimoto, Masahiro Suzuki, Yuri Murayama, and Kiyoshi Izumi.
    Proceedings of the Annual Conference of JSAI, 2024.
    J-STAGE / bib
  7. Development of a Dataset Representing Agricultural Product Prices for Generative AI of Agricultural Extension Specialist (in Japanese)
    Akio Kobayashi, Hiroki Sakaji, Tetsuo Katsuragi, Shotaro Mori, Akira Hashimoto, Masahiro Suzuki, and Takahiro Kawamura.
    Proceedings of the Annual Conference of JSAI, 2024.
    J-STAGE / bib
  8. JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning (in Japanese)
    Issey Sukeda, Masahiro Suzuki, Hiroki Sakaji, and Satoshi Kodera.
    The Thirtieth Annual Meeting of the Association for Natural Language Processing, 2024.
    paper / detail
  9. Language Model Construction and Domain Adaptation using Multiple Nodes (in Japanese)
    Masahiro Suzuki and Hiroki Sakaji.
    Intelligent Computing Systems (ICS), 2024.
    IPSJ / paper / detail
  10. LoRA Tuning Conversational Japanese Large Language Models using Japanese Instruction Dataset (in Japanese)
    Masahiro Suzuki, Masanori Hirano, and Hiroki Sakaji.
    IEICE Tech. Rep., 2023.
    IEICE / Jxiv / bib
  11. llm-japanese-dataset v0: Construction of Japanese Chat Dataset for Large Language Models (in Japanese)
    Masanori Hirano, Masahiro Suzuki, and Hiroki Sakaji.
    Special Interest Group on Natural Language Processing, Information Processing Society of Japan, 2023.
    Young Research Award (Co-author)
    SIG-NL / Jxiv / bib
  12. Construction of Japanese Instruction Dataset and its Application to Tuning of Large-scale Language Models (in Japanese)
    Masahiro Suzuki, Masanori Hirano, and Hiroki Sakaji.
    18th Symposium of Young Researcher Association for NLP Studies (YANS), 2023.
    Honorable Mention Award and ELYZA Award (Sponsor Award)
  13. Causal Text Mining in the Era of Large Language Modeling: A Reality Check (in Japanese)
    Takehiro Takayanagi, Ryotaro Kobayashi, Masahiro Suzuki, Hiroki Sakaji, and Kiyoshi Izumi.
    18th Symposium of Young Researcher Association for NLP Studies (YANS), 2023.
  14. Proposing task to extract differences from time series financial documents (in Japanese)
    Masahiro Suzuki, Hiroki Sakaji, and Kiyoshi Izumi.
    Proceedings of the Annual Conference of JSAI, 2023.
    J-STAGE / bib
  15. Performance Evaluation of Japanese Pre-trained Language Models with Different Word Segmentation Systems (in Japanese)
    Masahiro Suzuki, Hiroki Sakaji, and Kiyoshi Izumi.
    29th Annual Meeting of the Association for Natural Language Processing (NLP), 2023.
    paper / detail
  16. Stock Price Trend Forecast using Multiple Timeseries Analyst Reports (in Japanese)
    Masahiro Suzuki, Hiroki Sakaji, Kiyoshi Izumi, and Yasushi Ishikawa.
    Workshop of Social System and Information Technology (WSSIT2022), 2022.
    paper / detail
  17. Construction and Validation of a Pre-Training and Additional Pre-Training Financial Language Model (in Japanese)
    Masahiro Suzuki, Hiroki Sakaji, Masanori Hirano, and Kiyoshi Izumi.
    Proceedings of JSAI Special Interest Group on Financial Informatics (SIG-FIN) 28, 2022.
    paper / detail
  18. Construction and Validation of Additional Pre-Training Language Model using Financial Documents (in Japanese)
    Masahiro Suzuki, Hiroki Sakaji, Kiyoshi Izumi, and Yasushi Ishikawa.
    28th Annual Meeting of the Association for Natural Language Processing (NLP), 2022.
    paper / detail
  19. Market Trend Analysis Using Polarity Index Generated from Analyst Reports (in Japanese)
    Rei Taguchi, Hikaru Watanabe, Masanori Hirano, Masahiro Suzuki, Hiroki Sakaji, Kiyoshi Izumi, and Kenji Hiramatsu.
    Proceedings of JSAI Special Interest Group on Financial Informatics (SIG-FIN) 27, 2021.
    J-STAGE / SIG-FIN / paper / detail
  20. Construction and Validation of a Pre-Trained Language Model Using Financial Documents (in Japanese)
    Masahiro Suzuki, Hiroki Sakaji, Masanori Hirano, and Kiyoshi Izumi.
    Proceedings of JSAI Special Interest Group on Financial Informatics (SIG-FIN) 27, 2021.
    J-STAGE / SIG-FIN / paper / detail
  21. Performance Validation of Pre-Trained BERT in the Financial Domain (in Japanese)
    Masahiro Suzuki, Hiroki Sakaji, Masanori Hirano, and Kiyoshi Izumi.
    IEICE Tech. Rep., 2021.
    IEICE / paper / detail
  22. Construction of Japanese Instruction Dataset and its Application to Tuning of Large-scale Language Models (in Japanese)
    Masahiro Suzuki, Hiroki Sakaji, Masanori Hirano, and Kiyoshi Izumi.
    16th Symposium of Young Researcher Association for NLP Studies (YANS), 2021.
  23. Stock Price Movement Forecast from Analyst Reports by Text Mining (in Japanese)
    Masahiro Suzuki, Toshiya Katagi, Hiroki Sakaji, Kiyoshi Izumi, and Yasushi Ishikawa.
    26th Annual Meeting of the Association for Natural Language Processing (NLP), 2020.
    paper / detail
  24. Net Income Forecast from Analyst Reports by Text Mining (in Japanese)
    Masahiro Suzuki, Hiroki Sakaji, Kiyoshi Izumi, Hiroyasu Matsushima, and Yasushi Ishikawa.
    Proceedings of the Annual Conference of JSAI, 2020.
    J-STAGE / bib

Preprint

  1. Interactive DualChecker for Mitigating Hallucinations in Distilling Large Language Models
    Meiyun Wang, Masahiro Suzuki, Hiroki Sakaji, and Kiyoshi Izumi.
    arXiv / bib
  2. Economy Watchers Survey provides Datasets and Tasks for Japanese Financial Domain
    Masahiro Suzuki and Hiroki Sakaji.
    arXiv / bib

Scholarship and Awards

Scholarship
  • 2020/04 :   TOYOTA/Dwango AI Scholarship (1 year: 1,200,000 yen / Approx. 11,000 USD)
  • 2020/04 :   JEES / SoftBank AI Human Resources Development Scholarship (1 year: 1,000,000 yen / Approx. 9,000 USD)
Awards

Academic Activities

Others

  • Produced the University of Tokyo Golf Team website (2018)