My quick notes on the “Building the Software 2.0 Stack” talk by Andrej Karpathy at the Train AI 2018 conference - machine learning for a human world.

Training Datasets

The part about building and managing datasets is very interesting. We don’t often get to hear about these problems.

Software 2.0 Integrated Development Environments (IDEs)

What will IDEs, including code editors, look like?

  • Show a full inventory or statistics of the current dataset.
  • Create or edit annotation layers for any datapoint.
  • Flag, escalate & resolve discrepancies in multiple labels.
  • Flag and escalate datapoints that are likely to be mislabeled.
  • Display predictions on an arbitrary set of test datapoints.
  • Autosuggest datapoints that should be labeled.
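
To make a few of these features concrete, here is a minimal Python sketch of my own, not something shown in the talk. The `Datapoint` record, the function names, the 0.9 confidence threshold and the toy data are all hypothetical. It assumes each datapoint carries labels from one or more annotators plus class probabilities from some already-trained model, and it uses predictive entropy as one possible way to pick what to label next.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Datapoint:
    """Hypothetical record: id, annotator labels, and model class probabilities."""
    id: str
    labels: list = field(default_factory=list)
    probs: dict = field(default_factory=dict)


def majority_label(d):
    """Most frequent annotator label, or None if the datapoint is unlabeled."""
    return Counter(d.labels).most_common(1)[0][0] if d.labels else None


def inventory(data):
    """Dataset statistics: number of datapoints per majority label."""
    return Counter(majority_label(d) for d in data)


def label_discrepancies(data):
    """Ids of datapoints whose annotators disagree with each other."""
    return [d.id for d in data if len(set(d.labels)) > 1]


def likely_mislabeled(data, threshold=0.9):
    """Ids of labeled datapoints where the model confidently predicts
    a class different from the majority label (threshold is an assumption)."""
    flagged = []
    for d in data:
        if not d.labels or not d.probs:
            continue
        pred, conf = max(d.probs.items(), key=lambda kv: kv[1])
        if conf >= threshold and pred != majority_label(d):
            flagged.append(d.id)
    return flagged


def entropy(probs):
    """Shannon entropy (in nats) of a class-probability dict."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)


def suggest_for_labeling(data, k=1):
    """Ids of the k unlabeled datapoints the model is least sure about."""
    unlabeled = [d for d in data if not d.labels and d.probs]
    ranked = sorted(unlabeled, key=lambda d: entropy(d.probs), reverse=True)
    return [d.id for d in ranked[:k]]


# Toy data, made up purely for illustration.
data = [
    Datapoint("a", ["cat", "cat"], {"cat": 0.95, "dog": 0.05}),
    Datapoint("b", ["cat", "dog", "dog"], {"cat": 0.60, "dog": 0.40}),
    Datapoint("c", ["dog"], {"cat": 0.97, "dog": 0.03}),
    Datapoint("d", [], {"cat": 0.50, "dog": 0.50}),
    Datapoint("e", [], {"cat": 0.99, "dog": 0.01}),
]

if __name__ == "__main__":
    print("inventory:", dict(inventory(data)))
    print("discrepancies:", label_discrepancies(data))
    print("likely mislabeled:", likely_mislabeled(data))
    print("label next:", suggest_for_labeling(data))
```

On this toy data, "b" is flagged for annotator disagreement, "c" as likely mislabeled, and "d" is suggested for labeling first. A real tool would presumably sit on top of a proper datastore and a UI; the point here is only that each bullet maps to a fairly small query over labels and predictions.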