Summary of Research Achievements |
In 2023, we developed an interpretability method for analyzing how well language models (LMs) represent a specific kind of structured knowledge, namely numeric properties such as a person's year of birth or a city's population. The method allows direct analysis and manipulation of an LM's internal state in order to control its behavior when generating output that involves numeric properties. In the broader context of interpretability, transparency, and explainability of LMs, this method contributes to an improved understanding of how LMs encode structured knowledge.
|