Learning Large Models, InternLM (书生·浦语) Part 6: LLM Evaluation with OpenCompass

LLM Evaluation with OpenCompass

Three questions about evaluation: Why / What / How

Why

What

Evaluations exist for many kinds of tasks, including vertical (domain-specific) fields.

How

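Evaluation covers both objective and subjective evaluation; subjective evaluation is scored either by human raters or by a judge model.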
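
To make the distinction concrete, here is a minimal sketch, not OpenCompass code. It assumes objective items are multiple-choice questions with options A to D, and that a subjective judge is any callable that returns a rating. The names `extract_choice`, `objective_accuracy`, `subjective_score`, and `toy_judge` are made up for illustration.

```python
import re
from typing import Callable, List, Optional


def extract_choice(response: str) -> Optional[str]:
    """Return the first option letter A-D that is not part of a longer Latin word."""
    # Simplification: a leading article such as "A good answer ..." would be
    # misread as option A; real extractors are more careful.
    match = re.search(r"(?<![A-Za-z])([ABCD])(?![A-Za-z])", response)
    return match.group(1) if match else None


def objective_accuracy(responses: List[str], answers: List[str]) -> float:
    """Objective evaluation: percentage of responses whose extracted choice equals the reference."""
    correct = sum(extract_choice(r) == a for r, a in zip(responses, answers))
    return 100.0 * correct / len(answers)


def subjective_score(responses: List[str], judge: Callable[[str], float]) -> float:
    """Subjective evaluation: average the ratings given by a judge (a human or a model)."""
    return sum(judge(r) for r in responses) / len(responses)


def toy_judge(response: str) -> float:
    # Hypothetical stand-in for a human rater or judge model:
    # one point per whitespace-separated token, capped at 10.
    return min(10.0, len(response.split()))


responses = ["The answer is B.", "C", "I am not sure.", "答案是D"]
answers = ["B", "C", "A", "D"]
print(objective_accuracy(responses, answers))   # 75.0
print(subjective_score(responses, toy_judge))   # 2.5
```

The objective score needs a reference answer for every item, while the subjective score only needs a judge, which is why open-ended tasks tend to fall into the second group.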

Prompt engineering

Mainstream evaluation frameworks

OpenCompass capability framework

  • Model layer
  • Capability layer
  • Method layer
  • Tool layer

A wide range of models is supported.

The evaluation pipeline is designed to split the work into multiple independently executed tasks, so that compute resources are used as fully as possible.
The model capability comparison results are then written out.
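
The task-splitting idea can be sketched in a few lines. This is a generic illustration rather than OpenCompass's own partitioner or runner: it assumes one task per (model, dataset) pair, and `partition`, `run_task`, and `run_all` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def partition(models, datasets):
    """Split an evaluation job into independent (model, dataset) tasks."""
    return list(product(models, datasets))


def run_task(task):
    """Hypothetical worker; a real one would run inference and scoring for one task."""
    model, dataset = task
    return f"{model} on {dataset}: done"


def run_all(models, datasets, max_workers=4):
    """Run every task on a worker pool and return the results in task order."""
    tasks = partition(models, datasets)
    # The tasks share no state, so any free worker can pick up the next one.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_task, tasks))


for line in run_all(["model-a", "model-b"], ["ceval-law", "ceval-logic"]):
    print(line)
```

Because each task is self-contained, a failed task can be rerun on its own without repeating the rest of the evaluation.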

Frontier exploration

Exploratory directions include:

  • Multimodal
  • Law
  • Medicine

Challenges

Hands-on practice

Create the development environment and prepare the datasets

List the supported datasets (using tools/list_configs.py; the command appears at the top of the output listing further down):

Launch the evaluation

Objective evaluation

Objective evaluation is driven mainly by the run.py script, with the following options:

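  • --datasets: the datasets to evaluate on
  • --hf-path: path to the model files
  • --tokenizer-path: path to the tokenizer
  • --max-seq-len: maximum length of the sequence the model reads in
  • --max-out-len: maximum length of the model output; usually set small for objective questions
  • --debug: debug mode, which prints the whole process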
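
As a rough sketch of how these options fit together, the helper below assembles a run.py command line without executing it. The paths are placeholders, the numeric values are assumptions rather than settings taken from this post, the dataset name ceval_gen is inferred from the ceval / gen entries in the results, and `build_run_command` is a hypothetical helper, not part of OpenCompass.

```python
import shlex


def build_run_command(datasets, hf_path, tokenizer_path,
                      max_seq_len, max_out_len, debug=False):
    """Assemble a run.py invocation from the options listed above (does not run it)."""
    args = [
        "python", "run.py",
        "--datasets", *datasets,
        "--hf-path", hf_path,
        "--tokenizer-path", tokenizer_path,
        "--max-seq-len", str(max_seq_len),
        "--max-out-len", str(max_out_len),
    ]
    if debug:
        args.append("--debug")
    # shlex.join quotes any argument that contains spaces or shell metacharacters.
    return shlex.join(args)


command = build_run_command(
    datasets=["ceval_gen"],                       # assumed dataset config name
    hf_path="/path/to/internlm2-chat-7b",         # placeholder path
    tokenizer_path="/path/to/internlm2-chat-7b",  # placeholder path
    max_seq_len=2048,                             # assumed value
    max_out_len=16,                               # assumed value, small for multiple choice
    debug=True,
)
print(command)
```

Keeping the output limit small is reasonable here because a multiple-choice answer only needs a few tokens.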

Subjective evaluation

Subjective evaluation mainly involves editing the eval_subjective_alignbench.py config file; pay attention to the changes to the model, max_out_len, and similar settings.
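
The kind of change involved might look like the sketch below. It uses plain dicts as a stand-in for the real config: the entries and field values are hypothetical and are not copied from eval_subjective_alignbench.py, and `with_longer_outputs` is an illustrative helper.

```python
from copy import deepcopy

# Hypothetical model entries. The field names mirror the options named in this
# post (max_seq_len, max_out_len); the values are placeholders.
models = [
    dict(abbr="internlm2-chat-7b", path="/path/to/internlm2-chat-7b",
         max_seq_len=2048, max_out_len=16),
]


def with_longer_outputs(model_cfgs, max_out_len):
    """Return copies of the model configs with max_out_len raised for open-ended answers."""
    updated = deepcopy(model_cfgs)
    for cfg in updated:
        cfg["max_out_len"] = max_out_len
    return updated


subjective_models = with_longer_outputs(models, max_out_len=2048)
print(subjective_models[0]["max_out_len"])
```

Open-ended answers are much longer than a single option letter, so an output limit tuned for objective questions would truncate them.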

Final results:

python tools/list_configs.py internlm ceval 
20240122_153109
tabulate format
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dataset                                         version    metric         mode      opencompass.models.huggingface.HuggingFace_model_repos_internlm2-chat-7b
----------------------------------------------  ---------  -------------  ------  --------------------------------------------------------------------------
ceval-computer_network                          db9ce2     accuracy       gen                                                                          47.37
ceval-operating_system                          1c2571     accuracy       gen                                                                          57.89
ceval-computer_architecture                     a74dad     accuracy       gen                                                                          38.1
ceval-college_programming                       4ca32a     accuracy       gen                                                                          18.92
ceval-college_physics                           963fa8     accuracy       gen                                                                           5.26
ceval-college_chemistry                         e78857     accuracy       gen                                                                           0
ceval-advanced_mathematics                      ce03e2     accuracy       gen                                                                           0
ceval-probability_and_statistics                65e812     accuracy       gen                                                                          11.11
ceval-discrete_mathematics                      e894ae     accuracy       gen                                                                          18.75
ceval-electrical_engineer                       ae42b9     accuracy       gen                                                                          18.92
ceval-metrology_engineer                        ee34ea     accuracy       gen                                                                          50
ceval-high_school_mathematics                   1dc5bf     accuracy       gen                                                                           0
ceval-high_school_physics                       adf25f     accuracy       gen                                                                          31.58
ceval-high_school_chemistry                     2ed27f     accuracy       gen                                                                          26.32
ceval-high_school_biology                       8e2b9a     accuracy       gen                                                                          26.32
ceval-middle_school_mathematics                 bee8d5     accuracy       gen                                                                          21.05
ceval-middle_school_biology                     86817c     accuracy       gen                                                                          66.67
ceval-middle_school_physics                     8accf6     accuracy       gen                                                                          52.63
ceval-middle_school_chemistry                   167a15     accuracy       gen                                                                          80
ceval-veterinary_medicine                       b4e08d     accuracy       gen                                                                          39.13
ceval-college_economics                         f3f4e6     accuracy       gen                                                                          29.09
ceval-business_administration                   c1614e     accuracy       gen                                                                          30.3
ceval-marxism                                   cf874c     accuracy       gen                                                                          84.21
ceval-mao_zedong_thought                        51c7a4     accuracy       gen                                                                          70.83
ceval-education_science                         591fee     accuracy       gen                                                                          62.07
ceval-teacher_qualification                     4e4ced     accuracy       gen                                                                          77.27
ceval-high_school_politics                      5c0de2     accuracy       gen                                                                          21.05
ceval-high_school_geography                     865461     accuracy       gen                                                                          47.37
ceval-middle_school_politics                    5be3e7     accuracy       gen                                                                          38.1
ceval-middle_school_geography                   8a63be     accuracy       gen                                                                          58.33
ceval-modern_chinese_history                    fc01af     accuracy       gen                                                                          65.22
ceval-ideological_and_moral_cultivation         a2aa4a     accuracy       gen                                                                          89.47
ceval-logic                                     f5b022     accuracy       gen                                                                          13.64
ceval-law                                       a110a1     accuracy       gen                                                                          37.5
ceval-chinese_language_and_literature           0f8b68     accuracy       gen                                                                          47.83
ceval-art_studies                               2a1300     accuracy       gen                                                                          66.67
ceval-professional_tour_guide                   4e673e     accuracy       gen                                                                          82.76
ceval-legal_professional                        ce8787     accuracy       gen                                                                          30.43
ceval-high_school_chinese                       315705     accuracy       gen                                                                          21.05
ceval-high_school_history                       7eb30a     accuracy       gen                                                                          75
ceval-middle_school_history                     48ab4a     accuracy       gen                                                                          68.18
ceval-civil_servant                             87d061     accuracy       gen                                                                          38.3
ceval-sports_science                            70f27b     accuracy       gen                                                                          63.16
ceval-plant_protection                          8941f9     accuracy       gen                                                                          68.18
ceval-basic_medicine                            c409d6     accuracy       gen                                                                          57.89
ceval-clinical_medicine                         49e82d     accuracy       gen                                                                          45.45
ceval-urban_and_rural_planner                   95b885     accuracy       gen                                                                          58.7
ceval-accountant                                002837     accuracy       gen                                                                          34.69
ceval-fire_engineer                             bc23f5     accuracy       gen                                                                          12.9
ceval-environmental_impact_assessment_engineer  c64e2d     accuracy       gen                                                                          38.71
ceval-tax_accountant                            3a5e3c     accuracy       gen                                                                          42.86
ceval-physician                                 6e277d     accuracy       gen                                                                          51.02
ceval-stem                                      -          naive_average  gen                                                                          30.5
ceval-social-science                            -          naive_average  gen                                                                          51.86
ceval-humanities                                -          naive_average  gen                                                                          54.34
ceval-other                                     -          naive_average  gen                                                                          46.53
ceval-hard                                      -          naive_average  gen                                                                          11.63
ceval                                           -          naive_average  gen                                                                          43.04
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

-------------------------------------------------------------------------------------------------------------------------------- THIS IS A DIVIDER --------------------------------------------------------------------------------------------------------------------------------

csv format
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dataset,version,metric,mode,opencompass.models.huggingface.HuggingFace_model_repos_internlm2-chat-7b
ceval-computer_network,db9ce2,accuracy,gen,47.37
ceval-operating_system,1c2571,accuracy,gen,57.89
ceval-computer_architecture,a74dad,accuracy,gen,38.10
ceval-college_programming,4ca32a,accuracy,gen,18.92
ceval-college_physics,963fa8,accuracy,gen,5.26
ceval-college_chemistry,e78857,accuracy,gen,0.00
ceval-advanced_mathematics,ce03e2,accuracy,gen,0.00
ceval-probability_and_statistics,65e812,accuracy,gen,11.11
ceval-discrete_mathematics,e894ae,accuracy,gen,18.75
ceval-electrical_engineer,ae42b9,accuracy,gen,18.92
ceval-metrology_engineer,ee34ea,accuracy,gen,50.00
ceval-high_school_mathematics,1dc5bf,accuracy,gen,0.00
ceval-high_school_physics,adf25f,accuracy,gen,31.58
ceval-high_school_chemistry,2ed27f,accuracy,gen,26.32
ceval-high_school_biology,8e2b9a,accuracy,gen,26.32
ceval-middle_school_mathematics,bee8d5,accuracy,gen,21.05
ceval-middle_school_biology,86817c,accuracy,gen,66.67
ceval-middle_school_physics,8accf6,accuracy,gen,52.63
ceval-middle_school_chemistry,167a15,accuracy,gen,80.00
ceval-veterinary_medicine,b4e08d,accuracy,gen,39.13
ceval-college_economics,f3f4e6,accuracy,gen,29.09
ceval-business_administration,c1614e,accuracy,gen,30.30
ceval-marxism,cf874c,accuracy,gen,84.21
ceval-mao_zedong_thought,51c7a4,accuracy,gen,70.83
ceval-education_science,591fee,accuracy,gen,62.07
ceval-teacher_qualification,4e4ced,accuracy,gen,77.27
ceval-high_school_politics,5c0de2,accuracy,gen,21.05
ceval-high_school_geography,865461,accuracy,gen,47.37
ceval-middle_school_politics,5be3e7,accuracy,gen,38.10
ceval-middle_school_geography,8a63be,accuracy,gen,58.33
ceval-modern_chinese_history,fc01af,accuracy,gen,65.22
ceval-ideological_and_moral_cultivation,a2aa4a,accuracy,gen,89.47
ceval-logic,f5b022,accuracy,gen,13.64
ceval-law,a110a1,accuracy,gen,37.50
ceval-chinese_language_and_literature,0f8b68,accuracy,gen,47.83
ceval-art_studies,2a1300,accuracy,gen,66.67
ceval-professional_tour_guide,4e673e,accuracy,gen,82.76
ceval-legal_professional,ce8787,accuracy,gen,30.43
ceval-high_school_chinese,315705,accuracy,gen,21.05
ceval-high_school_history,7eb30a,accuracy,gen,75.00
ceval-middle_school_history,48ab4a,accuracy,gen,68.18
ceval-civil_servant,87d061,accuracy,gen,38.30
ceval-sports_science,70f27b,accuracy,gen,63.16
ceval-plant_protection,8941f9,accuracy,gen,68.18
ceval-basic_medicine,c409d6,accuracy,gen,57.89
ceval-clinical_medicine,49e82d,accuracy,gen,45.45
ceval-urban_and_rural_planner,95b885,accuracy,gen,58.70
ceval-accountant,002837,accuracy,gen,34.69
ceval-fire_engineer,bc23f5,accuracy,gen,12.90
ceval-environmental_impact_assessment_engineer,c64e2d,accuracy,gen,38.71
ceval-tax_accountant,3a5e3c,accuracy,gen,42.86
ceval-physician,6e277d,accuracy,gen,51.02
ceval-stem,-,naive_average,gen,30.50
ceval-social-science,-,naive_average,gen,51.86
ceval-humanities,-,naive_average,gen,54.34
ceval-other,-,naive_average,gen,46.53
ceval-hard,-,naive_average,gen,11.63
ceval,-,naive_average,gen,43.04
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

-------------------------------------------------------------------------------------------------------------------------------- THIS IS A DIVIDER --------------------------------------------------------------------------------------------------------------------------------

raw format
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-------------------------------
Model: opencompass.models.huggingface.HuggingFace_model_repos_internlm2-chat-7b
ceval-computer_network: {'accuracy': 47.368421052631575}
ceval-operating_system: {'accuracy': 57.89473684210527}
ceval-computer_architecture: {'accuracy': 38.095238095238095}
ceval-college_programming: {'accuracy': 18.91891891891892}
ceval-college_physics: {'accuracy': 5.263157894736842}
ceval-college_chemistry: {'accuracy': 0.0}
ceval-advanced_mathematics: {'accuracy': 0.0}
ceval-probability_and_statistics: {'accuracy': 11.11111111111111}
ceval-discrete_mathematics: {'accuracy': 18.75}
ceval-electrical_engineer: {'accuracy': 18.91891891891892}
ceval-metrology_engineer: {'accuracy': 50.0}
ceval-high_school_mathematics: {'accuracy': 0.0}
ceval-high_school_physics: {'accuracy': 31.57894736842105}
ceval-high_school_chemistry: {'accuracy': 26.31578947368421}
ceval-high_school_biology: {'accuracy': 26.31578947368421}
ceval-middle_school_mathematics: {'accuracy': 21.052631578947366}
ceval-middle_school_biology: {'accuracy': 66.66666666666666}
ceval-middle_school_physics: {'accuracy': 52.63157894736842}
ceval-middle_school_chemistry: {'accuracy': 80.0}
ceval-veterinary_medicine: {'accuracy': 39.130434782608695}
ceval-college_economics: {'accuracy': 29.09090909090909}
ceval-business_administration: {'accuracy': 30.303030303030305}
ceval-marxism: {'accuracy': 84.21052631578947}
ceval-mao_zedong_thought: {'accuracy': 70.83333333333334}
ceval-education_science: {'accuracy': 62.06896551724138}
ceval-teacher_qualification: {'accuracy': 77.27272727272727}
ceval-high_school_politics: {'accuracy': 21.052631578947366}
ceval-high_school_geography: {'accuracy': 47.368421052631575}
ceval-middle_school_politics: {'accuracy': 38.095238095238095}
ceval-middle_school_geography: {'accuracy': 58.333333333333336}
ceval-modern_chinese_history: {'accuracy': 65.21739130434783}
ceval-ideological_and_moral_cultivation: {'accuracy': 89.47368421052632}
ceval-logic: {'accuracy': 13.636363636363635}
ceval-law: {'accuracy': 37.5}
ceval-chinese_language_and_literature: {'accuracy': 47.82608695652174}
ceval-art_studies: {'accuracy': 66.66666666666666}
ceval-professional_tour_guide: {'accuracy': 82.75862068965517}
ceval-legal_professional: {'accuracy': 30.434782608695656}
ceval-high_school_chinese: {'accuracy': 21.052631578947366}
ceval-high_school_history: {'accuracy': 75.0}
ceval-middle_school_history: {'accuracy': 68.18181818181817}
ceval-civil_servant: {'accuracy': 38.297872340425535}
ceval-sports_science: {'accuracy': 63.1578947368421}
ceval-plant_protection: {'accuracy': 68.18181818181817}
ceval-basic_medicine: {'accuracy': 57.89473684210527}
ceval-clinical_medicine: {'accuracy': 45.45454545454545}
ceval-urban_and_rural_planner: {'accuracy': 58.69565217391305}
ceval-accountant: {'accuracy': 34.69387755102041}
ceval-fire_engineer: {'accuracy': 12.903225806451612}
ceval-environmental_impact_assessment_engineer: {'accuracy': 38.70967741935484}
ceval-tax_accountant: {'accuracy': 42.857142857142854}
ceval-physician: {'accuracy': 51.02040816326531}
ceval-stem: {'ceval-computer_network': 47.368421052631575, 'ceval-operating_system': 57.89473684210527, 'ceval-computer_architecture': 38.095238095238095, 'ceval-college_programming': 18.91891891891892, 'ceval-college_physics': 5.263157894736842, 'ceval-college_chemistry': 0.0, 'ceval-advanced_mathematics': 0.0, 'ceval-probability_and_statistics': 11.11111111111111, 'ceval-discrete_mathematics': 18.75, 'ceval-electrical_engineer': 18.91891891891892, 'ceval-metrology_engineer': 50.0, 'ceval-high_school_mathematics': 0.0, 'ceval-high_school_physics': 31.57894736842105, 'ceval-high_school_chemistry': 26.31578947368421, 'ceval-high_school_biology': 26.31578947368421, 'ceval-middle_school_mathematics': 21.052631578947366, 'ceval-middle_school_biology': 66.66666666666666, 'ceval-middle_school_physics': 52.63157894736842, 'ceval-middle_school_chemistry': 80.0, 'ceval-veterinary_medicine': 39.130434782608695, 'naive_average': 30.50061705625207}
ceval-social-science: {'ceval-college_economics': 29.09090909090909, 'ceval-business_administration': 30.303030303030305, 'ceval-marxism': 84.21052631578947, 'ceval-mao_zedong_thought': 70.83333333333334, 'ceval-education_science': 62.06896551724138, 'ceval-teacher_qualification': 77.27272727272727, 'ceval-high_school_politics': 21.052631578947366, 'ceval-high_school_geography': 47.368421052631575, 'ceval-middle_school_politics': 38.095238095238095, 'ceval-middle_school_geography': 58.333333333333336, 'naive_average': 51.86291158931812}
ceval-humanities: {'ceval-modern_chinese_history': 65.21739130434783, 'ceval-ideological_and_moral_cultivation': 89.47368421052632, 'ceval-logic': 13.636363636363635, 'ceval-law': 37.5, 'ceval-chinese_language_and_literature': 47.82608695652174, 'ceval-art_studies': 66.66666666666666, 'ceval-professional_tour_guide': 82.75862068965517, 'ceval-legal_professional': 30.434782608695656, 'ceval-high_school_chinese': 21.052631578947366, 'ceval-high_school_history': 75.0, 'ceval-middle_school_history': 68.18181818181817, 'naive_average': 54.340731439412956}
ceval-other: {'ceval-civil_servant': 38.297872340425535, 'ceval-sports_science': 63.1578947368421, 'ceval-plant_protection': 68.18181818181817, 'ceval-basic_medicine': 57.89473684210527, 'ceval-clinical_medicine': 45.45454545454545, 'ceval-urban_and_rural_planner': 58.69565217391305, 'ceval-accountant': 34.69387755102041, 'ceval-fire_engineer': 12.903225806451612, 'ceval-environmental_impact_assessment_engineer': 38.70967741935484, 'ceval-tax_accountant': 42.857142857142854, 'ceval-physician': 51.02040816326531, 'naive_average': 46.533350138807684}
ceval-hard: {'ceval-advanced_mathematics': 0.0, 'ceval-discrete_mathematics': 18.75, 'ceval-probability_and_statistics': 11.11111111111111, 'ceval-college_chemistry': 0.0, 'ceval-college_physics': 5.263157894736842, 'ceval-high_school_mathematics': 0.0, 'ceval-high_school_chemistry': 26.31578947368421, 'ceval-high_school_physics': 31.57894736842105, 'naive_average': 11.627375730994151}
ceval: {'ceval-computer_network': 47.368421052631575, 'ceval-operating_system': 57.89473684210527, 'ceval-computer_architecture': 38.095238095238095, 'ceval-college_programming': 18.91891891891892, 'ceval-college_physics': 5.263157894736842, 'ceval-college_chemistry': 0.0, 'ceval-advanced_mathematics': 0.0, 'ceval-probability_and_statistics': 11.11111111111111, 'ceval-discrete_mathematics': 18.75, 'ceval-electrical_engineer': 18.91891891891892, 'ceval-metrology_engineer': 50.0, 'ceval-high_school_mathematics': 0.0, 'ceval-high_school_physics': 31.57894736842105, 'ceval-high_school_chemistry': 26.31578947368421, 'ceval-high_school_biology': 26.31578947368421, 'ceval-middle_school_mathematics': 21.052631578947366, 'ceval-middle_school_biology': 66.66666666666666, 'ceval-middle_school_physics': 52.63157894736842, 'ceval-middle_school_chemistry': 80.0, 'ceval-veterinary_medicine': 39.130434782608695, 'ceval-college_economics': 29.09090909090909, 'ceval-business_administration': 30.303030303030305, 'ceval-marxism': 84.21052631578947, 'ceval-mao_zedong_thought': 70.83333333333334, 'ceval-education_science': 62.06896551724138, 'ceval-teacher_qualification': 77.27272727272727, 'ceval-high_school_politics': 21.052631578947366, 'ceval-high_school_geography': 47.368421052631575, 'ceval-middle_school_politics': 38.095238095238095, 'ceval-middle_school_geography': 58.333333333333336, 'ceval-modern_chinese_history': 65.21739130434783, 'ceval-ideological_and_moral_cultivation': 89.47368421052632, 'ceval-logic': 13.636363636363635, 'ceval-law': 37.5, 'ceval-chinese_language_and_literature': 47.82608695652174, 'ceval-art_studies': 66.66666666666666, 'ceval-professional_tour_guide': 82.75862068965517, 'ceval-legal_professional': 30.434782608695656, 'ceval-high_school_chinese': 21.052631578947366, 'ceval-high_school_history': 75.0, 'ceval-middle_school_history': 68.18181818181817, 'ceval-civil_servant': 38.297872340425535, 'ceval-sports_science': 63.1578947368421, 
'ceval-plant_protection': 68.18181818181817, 'ceval-basic_medicine': 57.89473684210527, 'ceval-clinical_medicine': 45.45454545454545, 'ceval-urban_and_rural_planner': 58.69565217391305, 'ceval-accountant': 34.69387755102041, 'ceval-fire_engineer': 12.903225806451612, 'ceval-environmental_impact_assessment_engineer': 38.70967741935484, 'ceval-tax_accountant': 42.857142857142854, 'ceval-physician': 51.02040816326531, 'naive_average': 43.043391430358646}
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
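
The naive_average rows in the summary are unweighted means of the listed subset scores, which can be checked against the csv output. The sketch below recomputes ceval-hard from its eight subsets; the rows are copied from the csv section above, and `naive_average` is an illustrative helper rather than an OpenCompass function.

```python
import csv
import io

# Header and the eight ceval-hard subset rows, copied from the csv output above.
SUMMARY_CSV = """\
dataset,version,metric,mode,opencompass.models.huggingface.HuggingFace_model_repos_internlm2-chat-7b
ceval-advanced_mathematics,ce03e2,accuracy,gen,0.00
ceval-discrete_mathematics,e894ae,accuracy,gen,18.75
ceval-probability_and_statistics,65e812,accuracy,gen,11.11
ceval-college_chemistry,e78857,accuracy,gen,0.00
ceval-college_physics,963fa8,accuracy,gen,5.26
ceval-high_school_mathematics,1dc5bf,accuracy,gen,0.00
ceval-high_school_chemistry,2ed27f,accuracy,gen,26.32
ceval-high_school_physics,adf25f,accuracy,gen,31.58
"""


def naive_average(csv_text, subsets):
    """Unweighted mean of the scores of the named subsets."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    score_column = list(rows[0].keys())[-1]  # the last column holds the model's scores
    scores = {row["dataset"]: float(row[score_column]) for row in rows}
    return sum(scores[name] for name in subsets) / len(subsets)


hard_subsets = [
    "ceval-advanced_mathematics", "ceval-discrete_mathematics",
    "ceval-probability_and_statistics", "ceval-college_chemistry",
    "ceval-college_physics", "ceval-high_school_mathematics",
    "ceval-high_school_chemistry", "ceval-high_school_physics",
]
print(f"{naive_average(SUMMARY_CSV, hard_subsets):.4f}")
```

The result agrees with the reported ceval-hard score of 11.63 up to the rounding of the two-decimal inputs. The overall ceval score is likewise the unweighted mean over all 52 subsets, not the mean of the four category averages.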
