What Data Is Needed for Supervised Fine-Tuning (SFT)?
- Updated: 2023-08-24 13:56:53
- First published: 2023-08-23 23:21:29
- Fine-tuning
This article discusses the types and quality of data required for Supervised Fine-Tuning (SFT). It covers the following aspects:
- Objectives of Supervised Fine-Tuning: Improving performance on specific tasks, domain adaptability, and the interpretability and controllability of the model, with the overarching goal of making the system more robust.
- Core Considerations: Keeping the data diverse, not treating SFT merely as data supplementation, including few-shot and chain-of-thought (CoT) data where appropriate, prioritizing data quality over quantity, and recognizing that adding more data without adding diversity brings diminishing returns.
- Data Quality Requirements: Length limits for questions and answers, the accuracy of answers, selecting data according to industry requirements, covering a diverse set of NLP abilities, and caution against using too much vertical-domain data.
- Specific Examples: The article provides both good and poor dataset examples to illustrate how to choose and evaluate data.
- Q&A Section: This part explains why including code-writing ability in SFT is essential, emphasizing its role in improving reasoning and structured output.
In summary, the article offers guidance on how to conduct supervised fine-tuning, underlines the importance of data diversity and quality, and presents implementation strategies and examples to support these points.
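The data quality points above (length limits and avoiding an overly narrow domain mix) can be sketched as a small pre-filtering step. This is only an illustrative sketch: the field names `question`, `answer`, and `category`, the threshold values, and the function names are all assumptions made for this example, not values or code from the article.

```python
from collections import Counter

# Hypothetical thresholds; the article does not give concrete numbers.
MAX_QUESTION_CHARS = 2000
MAX_ANSWER_CHARS = 4000
MAX_CATEGORY_SHARE = 0.5  # assumed cap on the share of any single task category


def filter_by_length(samples):
    """Keep samples whose question and answer are non-empty and within the limits."""
    kept = []
    for s in samples:
        q, a = s["question"].strip(), s["answer"].strip()
        if q and a and len(q) <= MAX_QUESTION_CHARS and len(a) <= MAX_ANSWER_CHARS:
            kept.append(s)
    return kept


def category_shares(samples):
    """Return the fraction of samples per task category, a rough diversity signal."""
    counts = Counter(s["category"] for s in samples)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}


def overrepresented(samples, max_share=MAX_CATEGORY_SHARE):
    """List categories whose share of the dataset exceeds max_share."""
    shares = category_shares(samples)
    return sorted(cat for cat, share in shares.items() if share > max_share)
```

A length filter and a category count do not measure answer accuracy, which still needs human or model-assisted review. They only catch the mechanical problems before that review starts.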