Our unsupervised learning model

Capturing, refining and providing valuable output through fashion digital interfaces

Background: Chicisimo built the interfaces and infrastructure to help people find what outfit to wear and what garment to buy. We focused on (i) creating input interfaces that allow people to express a need; (ii) building the data system that understands and responds to that need; and (iii) creating an environment where users are incentivized to provide data, especially descriptors of outfits and garments, to keep the learning loop going.

Process: We approached the above opportunity by building a small team of product people and engineers with a common understanding of the what-to-wear problem, seeking to build interfaces with the following characteristics:

  1. The interfaces we build must allow people to express a what-to-wear need, from objective (type of garment, color, fabric, print…) and subjective (styles, occasions, age…) points of view; 
  2. The development of the interfaces and the development of the data infrastructure influence each other;
  3. For 1 and 2 to happen, we have to uncover the organic data structures that arise from these interfaces;
  4. While users provide input and respond to output, the entire experience encourages them to provide further input. Input comes in the form of (i) expressed taste, such as the styles they like, the occasions that are relevant to them in terms of clothing, and the types of clothing they like or own. Input also comes in the form of (ii) categorized content, such as grouping outfits into described albums, tagging outfits, establishing correlations among items, providing relevant queries, etc. This input gives our team the feedback to build more effective interfaces, and the accumulated data lets us respond to more complex inputs;
  5. The data structures are refined through unsupervised learning and collaborative filtering, which let us find new data structures and new ways to aggregate the data. These in turn inform subsequent interface decisions, and a learning loop is created.
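
To make the collaborative filtering in point 5 more concrete, here is a minimal sketch of item-based collaborative filtering over saved outfits. It is illustrative only and is not a description of our production system: the `saves` data, the function names `item_similarity` and `recommend`, and the choice of cosine similarity over co-saves are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical input: each user mapped to the set of outfit ids they saved.
saves = {
    "ana": {"o1", "o2", "o3"},
    "bea": {"o1", "o2"},
    "cris": {"o2", "o3"},
    "dani": {"o4"},
}

def item_similarity(saves):
    """Cosine similarity between outfits, based on which users saved both."""
    count = defaultdict(int)     # outfit -> number of users who saved it
    together = defaultdict(int)  # (outfit_a, outfit_b) -> users who saved both
    for outfits in saves.values():
        for o in outfits:
            count[o] += 1
        for a, b in combinations(sorted(outfits), 2):
            together[(a, b)] += 1
    sim = defaultdict(dict)
    for (a, b), n in together.items():
        score = n / sqrt(count[a] * count[b])
        sim[a][b] = score
        sim[b][a] = score
    return sim

def recommend(user, saves, sim, k=3):
    """Rank outfits the user has not saved by summed similarity to saved ones."""
    seen = saves[user]
    scores = defaultdict(float)
    for o in seen:
        for other, s in sim.get(o, {}).items():
            if other not in seen:
                scores[other] += s
    # Highest score first; ties broken by outfit id for a stable order.
    return sorted(scores, key=lambda o: (-scores[o], o))[:k]

sim = item_similarity(saves)
print(recommend("bea", saves, sim))  # ['o3']
```

In this toy data, "bea" is recommended outfit o3 because other users who saved o1 and o2 also saved o3. The same co-occurrence idea could be applied to descriptors, albums or garments instead of outfits.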

As a result of the above, our product experience leads people to create content for us, search for it and interact with it. It is this community that, by creating content, categorizing it and searching for it, allows us to improve our understanding of the data and to create new algorithms and experiences for the community.

Automatic curation: Our system encourages content creators differently based on the content they create. As not all users are equally good at tagging or describing their content, we've defined an approach to easily classify users based on the quality of their content, and we give a different weight to the information provided by different users. Some users are featured in prominent places throughout our consumer apps, creating a positive feedback loop where they get rewarded for posting more and better content. We also monitor outfits with high engagement and add them to the featured content if they meet certain criteria.
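
As a rough illustration of weighting information by user quality, the sketch below aggregates tags per outfit and lets each tag count for more or less depending on the tier of the user who supplied it. The tier names, the weights in `TIER_WEIGHT`, the `min_score` threshold and the function name `weighted_descriptors` are hypothetical; the actual classification approach and weights are not described here.

```python
from collections import defaultdict

# Hypothetical quality tiers and weights; actual values are not given in the text.
TIER_WEIGHT = {"trusted": 3.0, "regular": 1.0, "new": 0.5}

def weighted_descriptors(tag_events, user_tier, min_score=2.0):
    """Aggregate (user, outfit, tag) events into per-outfit tag scores.

    Each event contributes the weight of the tagging user's tier. Tags whose
    total score falls below min_score are dropped.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for user, outfit, tag in tag_events:
        # Users without a known tier are treated as "new".
        weight = TIER_WEIGHT[user_tier.get(user, "new")]
        scores[outfit][tag] += weight
    return {
        outfit: {t: s for t, s in tags.items() if s >= min_score}
        for outfit, tags in scores.items()
    }

user_tier = {"ana": "trusted", "bea": "regular", "cris": "regular"}
events = [
    ("ana", "o1", "red pants"),
    ("bea", "o1", "office"),
    ("cris", "o1", "office"),
    ("dani", "o1", "beach"),  # unknown user, treated as "new"
]
result = weighted_descriptors(events, user_tier)
print(result)  # {'o1': {'red pants': 3.0, 'office': 2.0}}
```

Here a single tag from a trusted user is kept, two agreeing tags from regular users are kept, and a lone tag from an unknown user is dropped.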

Future iterations: There are several ways in which our process can be improved, and we have other sources of data we can use to improve or create new experiences:

  • We are in the early days of allowing people to describe/categorize the clothes in their closets, because so far we have not needed to go further. This will change;
  • We are not using the style of outfits people like when returning results to their queries;
  • We are not using the closet data to help people shop for new products. From the point of view of our data structure, a closet is equal to a catalogue, so obtaining shoppable clothes that correlate with items in a user's closet is something we can do easily;
  • We also know we can have the Graph attach further descriptors to certain outfits. We'll expand the collaborative filtering by adding the descriptors users are querying for when opening an outfit or product. For example, if an outfit shows up in a query including "red pants", and is constantly being opened, there is a high probability that the look is about "red pants" too. Album names could also provide additional descriptors to outfits, but this has not been a priority for us;
  • We are not using contextual data, such as geolocation, time of the year or weather, to uncover new possibilities or to influence the output.
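
The "red pants" idea above, attaching a queried descriptor to an outfit that is frequently opened from that query, can be sketched as follows. This is a hedged illustration rather than an implemented feature: the log format, the function name `infer_descriptors` and the thresholds `min_impressions` and `min_open_rate` are assumptions chosen for the example.

```python
from collections import defaultdict

def infer_descriptors(impressions, min_impressions=5, min_open_rate=0.5):
    """Suggest descriptors for outfits from query/open logs.

    impressions: iterable of (query_descriptor, outfit_id, opened) tuples, one
    per time an outfit was shown in the results for that descriptor.
    Returns {outfit_id: {descriptor: open_rate}} for pairs that were shown
    often enough and opened at a high enough rate.
    """
    shown = defaultdict(int)
    opened = defaultdict(int)
    for descriptor, outfit, was_opened in impressions:
        shown[(outfit, descriptor)] += 1
        if was_opened:
            opened[(outfit, descriptor)] += 1
    suggestions = defaultdict(dict)
    for (outfit, descriptor), n in shown.items():
        rate = opened[(outfit, descriptor)] / n
        if n >= min_impressions and rate >= min_open_rate:
            suggestions[outfit][descriptor] = rate
    return dict(suggestions)

# Hypothetical log: o7 is opened often from "red pants" queries, o9 rarely,
# and o7 has too few "boho" impressions to draw any conclusion.
log = (
    [("red pants", "o7", True)] * 8
    + [("red pants", "o7", False)] * 2
    + [("red pants", "o9", True)] * 1
    + [("red pants", "o9", False)] * 9
    + [("boho", "o7", True)] * 2
)
print(infer_descriptors(log))  # {'o7': {'red pants': 0.8}}
```

The minimum-impressions threshold guards against attaching a descriptor on the basis of only a handful of opens.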