11 lines
466 B
Python
# Metrics for how well the agent did on the challenges. The calculation itself is left for the future, if we end up tracking specific tests.
# POTENTIAL METRICS
# pass/fail - in the future this could become a % of the challenge completed or of milestones achieved
# convergence - how long it took to get the result
# difficulty of the task - defined by comparing to previous runs against other agents
# consistency
# time passed
# budget used
# divergence (distractions not related to task at hand)
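
# A minimal sketch of how the potential metrics above might be computed from
# per-run records. Nothing here exists in the codebase yet: RunResult, its
# fields, and every function name are hypothetical, and the formulas (in
# particular for consistency, difficulty and divergence) are one possible
# reading of the notes above, not a settled definition.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class RunResult:
    """One agent run against one challenge (hypothetical record shape)."""

    passed: bool            # pass/fail
    seconds_elapsed: float  # time passed
    steps_to_result: int    # convergence: steps taken to get the result
    budget_used: float      # budget used, in whatever unit is tracked
    off_task_steps: int     # divergence: steps unrelated to the task


def pass_rate(runs: Sequence[RunResult]) -> float:
    """Fraction of runs that passed."""
    if not runs:
        raise ValueError("need at least one run")
    return sum(r.passed for r in runs) / len(runs)


def consistency(runs: Sequence[RunResult]) -> float:
    """Assumed definition: 1.0 if all runs agree, 0.0 at a 50/50 split."""
    return abs(2 * pass_rate(runs) - 1)


def difficulty(previous_runs_by_other_agents: Sequence[RunResult]) -> float:
    """Assumed definition: share of earlier runs by other agents that failed."""
    return 1.0 - pass_rate(previous_runs_by_other_agents)


def divergence(runs: Sequence[RunResult]) -> float:
    """Assumed definition: off-task steps as a share of all steps taken."""
    total_steps = sum(r.steps_to_result for r in runs)
    if total_steps == 0:
        return 0.0
    return sum(r.off_task_steps for r in runs) / total_steps


def summarize(runs: Sequence[RunResult]) -> dict:
    """Collect the per-challenge metrics into one dictionary."""
    return {
        "pass_rate": pass_rate(runs),
        "consistency": consistency(runs),
        "mean_steps_to_result": mean(r.steps_to_result for r in runs),
        "mean_seconds_elapsed": mean(r.seconds_elapsed for r in runs),
        "total_budget_used": sum(r.budget_used for r in runs),
        "divergence": divergence(runs),
    }
```

# Usage would look like summarize([RunResult(...), RunResult(...)]) for the
# runs of one agent on one challenge. difficulty() is kept separate because it
# takes the earlier runs of other agents rather than the runs being scored.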