Social Search with Aardvark SDForum Meeting
The Search SIG of SDForum had a really interesting talk on March 30, 2010 about social search with the co-founder of Aardvark, Damon Horowitz. When I saw that this talk was coming up, I was immediately drawn to it for a few reasons. First, social search is a somewhat new idea, the definition of which differs depending on who you talk to or what you read. Second, I'm an active Aardvark user and I was interested in some of the inner workings of this very effective and powerful service.
One of the key takeaways that I had from this meeting was that the idea of social search in the Aardvark world is to help the user ask a question and find a person to have a conversation with regarding that question. It is not necessarily about indexing and searching static social information in near-realtime. The search technology seems more geared toward understanding the question and then finding an appropriate person in your network to answer it, based on numerous criteria including a good conversational match.
Below are the notes that I took during the meeting. Horowitz did not have any slides for this presentation and the format was informal yet still extremely informative. My notes will reflect the somewhat stream of consciousness format of the talk.
- Damon Horowitz, co-founder and CTO of Aardvark. Now Director of Engineering of the Aardvark team at Google.
- Started out with artificial intelligence and language processing
- Wanted to be able to conversationally interact with information
- Information extraction and conversational interaction with a data store
- Worked at Perspecta before it was acquired by Excite@Home
- Went back to school to study language and philosophy and find out why the language processing wasn't good enough
- As a result shifted from having the computer system help find the information (and have a full understanding) to making it easier for humans to interact with each other
- A little over 2 years ago Aardvark started, as a bunch of social information started to come online. There were also problems in addressing the utility of search on mobile devices.
- It is also difficult to have high quality results for subjective searches, like "best restaurant" etc.
- Aardvark
- General start-up advice: don't just commit to your first idea. They spent the first 6 months creating little demos and mockups and showed them to friends and colleagues.
- How it works:
- You send Aardvark a question, and Aardvark finds somebody in your network who can answer it, reaches out to that person, and sees if they want to answer the question.
- When they prototyped this, discovered:
- People like answering questions. (Do a lot of user studies)
- Everybody likes to be helpful.
- Need more opportunities to be helpful.
- Works from latent good will.
- Aardvark reaches out and says something like "I think you might be able to answer this question."
- This is a form of flattery and is satisfying to the user.
- There is a very high response rate.
- Think of everything you know about and only a small fraction of that is online.
- Web search engines tap into the information available online.
- Social search is a way to tap into information in people's heads and the fuel for that is goodwill.
- How can you do this efficiently and at scale
- Other interesting discoveries along the way
- Social intermediation
- Aardvark is responsible for knowing what people know about
- Aardvark takes on the social burden of asking the questions
- You don't feel bad if you ignore Aardvark
- There are many reasons why you might not want to ask the question yourself.
- Social burden, might not want to bother
- You may not know who to ask (there are thousands of people you are connected with but you don't know what they know about).
- Indexing
- There is all this information on the web to index
- There is also all this information in people's heads to index
- Instead of indexing the information, index the people
- From a question and from what we know about the answering people
- Routing is both matching the questions to topics to people and also conversational matches (matching social affinity, vocabulary, wordiness, etc.)
- Use existing web information to help do the matching
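The "index the people" idea above can be sketched as an inverted index that maps topics to people instead of terms to documents. This is only an illustration under stated assumptions, not Aardvark's implementation: the class name `PeopleIndex`, the expertise scores, the `affinity` values, and the `affinity_weight` are all hypothetical.

```python
from collections import defaultdict


class PeopleIndex:
    """Hypothetical inverted index: topic -> {person: expertise score}."""

    def __init__(self):
        self._by_topic = defaultdict(dict)

    def add(self, person, topic, expertise):
        # Record how much a person is believed to know about a topic.
        self._by_topic[topic][person] = expertise

    def candidates(self, topics, affinity, affinity_weight=0.5):
        """Rank people by summed topic expertise plus a weighted affinity term."""
        scores = defaultdict(float)
        for topic in topics:
            for person, expertise in self._by_topic.get(topic, {}).items():
                scores[person] += expertise
        ranked = [
            (person, score + affinity_weight * affinity.get(person, 0.0))
            for person, score in scores.items()
        ]
        # Highest score first; ties are broken by name so the order is stable.
        return sorted(ranked, key=lambda item: (-item[1], item[0]))


index = PeopleIndex()
index.add("alice", "restaurants", 0.9)
index.add("bob", "restaurants", 0.6)
index.add("bob", "san francisco", 0.8)
ranking = index.candidates(["restaurants", "san francisco"], affinity={"alice": 0.4})
```

Here the affinity term stands in for the conversational match mentioned in the notes (social affinity, vocabulary, wordiness); how those signals were actually combined was not covered in the talk.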
- How it relates to other social searches.
- Personalized search on the web
- focusing on you and your search history
- Social search on the web
- extends that to you and your friends
- Twitter search
- Searching pre-existing content but based on your network
- Still operates on static content
- Bulletin board style
- Have a Q&A forum
- Instead of just anybody answering, your social circle answers the question
- Status messages are included in this
- Actively reaching out to people for a 1-1 conversation instead of waiting for them to browse
- Aardvark
- Where has Aardvark gone?
- After spending 6 months figuring out what to build, spent 6 months doing user testing in a "Wizard of Oz" sort of way
- The interaction was simulated.
- Developed the user interaction by asking people to try things out and get the feedback
- When the feedback was consistently good, then launched internal alpha
- Last March had a private beta at SXSW 2009.
- IM
- iPhone
- Web (last)
- Wanted people to think of Aardvark as a real world contact and not a web site application
- Public release was in October of 2009.
- Acquired by Google in February
- Financed Aardvark by
- Going to individual angel investors that might have good advice and made them "staked advisors." So you could gather good feedback without having them commit to a board. Could tap them for knowledge.
- Would this work inside of a firewall?
- There has been a lot of interest in something like this for an internal tool: Fortune 500, The White House, etc.
- Enterprise product was a good idea but wanted to leave that as a fall back
- Timing was more crucial on the consumer side, so went after that first.
- How does the question classifier work?
- Built a research group quickly with 4 AI PhDs
- Built up a taxonomy of 3k topics, built training data, and created a classifier
- Light partial parsing: what are the important terms? Do semantic mappings.
- Also look at structured data sources for semantics
- Also look at unstructured data sources
- Run that in parallel and vote on "what that question is about."
- Consider semantic matches for the questions but also matching people conversationally.
- Business logic about spreading out the load of questions so you don't fatigue a user.
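The "run in parallel and vote" step described above can be sketched as several independent classifiers whose outputs are tallied by majority vote. This is a toy illustration, not the classifier the Aardvark team built: the three classifier functions, their keyword tables, and the topic names are all invented here, and they are called one after another for simplicity rather than truly in parallel.

```python
from collections import Counter


def keyword_classifier(question):
    # Toy stand-in for light partial parsing: look for important terms.
    text = question.lower()
    if "restaurant" in text or "dinner" in text:
        return "restaurants"
    if "doctor" in text or "headache" in text:
        return "health"
    return None


def structured_classifier(question):
    # Toy stand-in for a structured data source: a tiny term -> topic table.
    table = {"sushi": "restaurants", "aspirin": "health"}
    for word in question.lower().split():
        topic = table.get(word.strip("?.,!"))
        if topic:
            return topic
    return None


def unstructured_classifier(question):
    # Toy stand-in for an unstructured source; a real one would use trained models.
    return "restaurants" if "eat" in question.lower() else None


def classify(question, classifiers):
    """Run every classifier and return the topic with the most votes, or None."""
    votes = Counter(
        topic for topic in (c(question) for c in classifiers) if topic is not None
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


CLASSIFIERS = [keyword_classifier, structured_classifier, unstructured_classifier]
topic = classify("Where can I eat good sushi for dinner?", CLASSIFIERS)
```

A classifier that has no opinion simply abstains by returning `None`, so a question that matches nothing yields no topic instead of a forced guess.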
- Break down across platforms
- 40% IM
- 40% iPhone
- 40% web
- Adds up to more than 100% because questions switch channels
- Scalability
- Technical scalability w/Google gives them the tools to scale out
- Paradigm scalability w/Google gets it out to 100M people
- Aardvark has to keep in line with the current idea of what a social circle means and how far out in degrees of friendship can you go?
- Social affinity depends on the question:
- Do you want an expert?
- Do you want somebody similar to you to answer?
- Aardvark
- AA means it floats to the top of the IM buddy list
- An animal: a trusty companion, personification ("I love Aardvark"), it's intelligent but not human intelligence.
- Social search is very user driven
- Enterprise and consumer are very different
- Social intermediation is very different
- The interaction would be different in the enterprise setting because you are in your job function and the other person is in their job function
- Domains/categories of questions asked
- Similar to web search
- Fewer navigational and reference queries
- Extended questions on health, relationships, etc.
- What is the satisfaction rate?
- Measured a few ways:
- In line feedback (was this a good answer)
- 70% good
- 15% okay
- 15% bad
- Implicit
- A good thing when the user responds to an answer
- Similar satisfaction rates 70/15/15
- The most important thing is your first interaction
- If your first question was answered quickly and by a friendly person, then you're coming back
- Otherwise you do not come back
- Privacy
- Are the conversations being archived?
- Yes?
- Will you be able to search on that question/answer?
- Focus at Aardvark was creating a live conversation, not recycling answers.
- Conversations are private unless one or the other person decided to publish that information
- At Google, it will depend on how it fits into the Google ecosystem
- He is pleased with the privacy conversation at Google
- What about a 1-to-Many interaction? (for say focus groups)
- Have seen people try to use Aardvark to get product feedback.
- Have had press come in and source reactions to a news event.
- Best with a dedicated app.
- Does it really work?
- Obviously not an active Aardvark user. ;-)
- Not trying to be a recommendation engine. Just try to put you in contact with the right people to have a conversation about it. You still need to make the determination if you think the answer is right.
- How much angel investing was raised? A-Round? Why Google? Revenue model
- Angel $1.5M
- Round A $6M
- Plan was for a B-Round (but were doing partnership meetings too and some of those turned into acquisition talks)
- Similar to Google culture (had former Google people). No interest in changing how the team worked. Wanted to keep the team the same.
- Lots of good
- Monetization
- Ad based
- Sponsored answers
- "Is there a good real estate site for the SV?"
- Advertisers say I'd like to answer questions about.
- First, look on the social network
- Then Aardvark asks, "We have a sponsored answer, would you like to see it?"
- High conversion.
- Suggest that people use affiliate-style links (i.e. Amazon) in their answers.
- Very high click through rates.
- Hunch (another site)
- Same tier as searching online content that other people have created.
- Not really on the social search landscape.
- Aardvark has a detector for English-only questions
- This will change within Google
- Multiple answers
- People are happy with 2 answers
- A quick one and a longer more well thought out
- Users
- American urban hubs
- SF, NY
- How did you roll out to users? (Chicken & Egg problem)
- Tell the user that we'll answer the question when we find an answer.
- Invite only to start with
- Use social virality
- Option to boast about your answer
- Invitations were motivated
- You could refer a question to somebody not on Aardvark
- They could either just answer the question or join and answer
- Had a lot of press
- Social data
- Public Facebook API
- Tried out old LinkedIn
- Web-mail importers
- Sit on top of existing social data
- User research
- Qualitative
- Bring in people from Craigslist, show them features
- Query existing users
- Hijack conversation
- Manually, when a question comes in, try different prompts with 12 people, see the responses, and contact the users about the experience
- Manual
- Quantitative
- Monetization
- Is there a conflict?
- Leads for sponsored questions are very high and the affiliates also bring in revenue
- Expand on the 6-months of user research
- Knew when they started: you can ask Aardvark a question and it will find an answer for you.
- Multitude of implementation
- Found that the impulsiveness of the IM was
- Had to find out what the expectations of a bot would be
- Prompts had to be determined
- Could tell people what they could do conversationally without having to teach the commands
- How it works
- Ask a question
- Aardvark tells you what it thinks the question is about, which is just the surface label
- User can
- Aardvark routing ranks the people who it thinks are best to answer
- What motivates the user to answer the question when there are no rewards?
- Did not build a reputation on purpose.
- You answer because it is gratifying to answer a question. I helped somebody and so somebody will help me.
- Ask each person 1 at a time with a 1 minute time out
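The "one person at a time with a 1 minute timeout" routing described above can be sketched as a simple loop over the ranked candidates. This is a minimal sketch under stated assumptions, not the actual routing code: the `ask` callable, the `max_asks` cap (a stand-in for keeping a question's reach small), and the simulated responders are all hypothetical.

```python
def route_question(question, ranked_people, ask, timeout_seconds=60, max_asks=5):
    """Ask candidates one at a time; return (person, answer) or None.

    `ask(person, question, timeout_seconds)` is a hypothetical callable that
    returns the answer text, or None if the person declines or times out.
    """
    for person in ranked_people[:max_asks]:
        answer = ask(person, question, timeout_seconds)
        if answer is not None:
            return person, answer
    return None


# Simulated responders for illustration: only "carol" answers.
def fake_ask(person, question, timeout_seconds):
    replies = {"carol": "Try the taqueria on 24th Street."}
    return replies.get(person)


result = route_question("Best burrito in SF?", ["alice", "bob", "carol"], fake_ask)
```

Because each question reaches only a few people in sequence, the same loop also shows why the system makes a poor channel for spam: lowering `max_asks` directly limits how many people a single question can reach.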
- Problems with scams and spams?
- It's a lousy spamming system because you have a very small reach. The question only goes to a few people.
- There is a system to find suspicious "spam" questions.
- People become better question askers. More details the better.
- Technology
- Started off as a Rails shop because it was an easy way to get started.
- Hosted on Amazon
- Had to do some refactoring to scale
- Twitter vs. Aardvark
- Twitter status messages are performative for entertainment
- It doesn't do anything to get the question answered. It may or may not get answered.
- Might be wasting your followers' time.
- Different paradigm.
- List of un-answered and unsatisfactory answers
- Hard to find a pattern
- Dumb questions
- Esoteric
- Nobody on your social network to answer it
- How do you prevent people from getting worn out from answering questions?
- Answer economy problem.