Personal Reflections

For 2015, I’m Trying To Give Up Antidepressants

It’s been a while since I have spoken about my mental health issues publicly. And I think it’s time to talk because I’m trying to stop taking antidepressants…and I don’t know yet whether I’ll succeed.


My last blister of antidepressants

I stopped taking my antidepressants in the first week of December 2014. Not suddenly…I went on a proper action plan for slowly weaning myself off them.

My current bout of depression started in September 2012, which makes it just over two years now. For over a year, I had been on the highest dose of my antidepressant medication – citalopram, 40mg / day – as my course of treatment. Additionally, I am still currently on pregabalin – a drug more commonly used to treat nerve pain in chronic cases – to control my Generalised Anxiety Disorder (GAD).

Seven months ago, I started feeling better…and that depression was no longer my problem. I felt a modicum of control in my life, through the consistent treatment that I was getting from my psychiatrist and my therapist. Sure, I still had “issues”…but the world didn’t feel permanently dull and dim, like it did in my earlier phases of full-blown depression. So, I told them that I was feeling better and would like to begin proceeding with ceasing antidepressants. I began this process in June 2014.

I had good reason to be worried about “doing this the right way” because antidepressant discontinuation syndrome is a very real thing. In the alphabet soup of mental health disorders, this one’s bad because sudden or quick cessation of SSRI antidepressants (like citalopram) can cause electric shocks in your brain (“brain zaps”; which I’ve had to deal with earlier), sensory disturbances, insomnia and a whole lot more. The cause for this, like many other mental health problems, remains unknown.

It’s the possibility of insomnia that I was most worried about. I knew that worsening insomnia would flare up my anxiety disorder, because it usually does. I wanted this badly. There was a part of me which wanted to know that I had won. And I wanted this from a medical professional because I worship objective feedback.

I desperately wanted closure on this chapter of my life that had gone on for the past two years.

I needed this.

The road to the end of the tunnel wasn’t easy. Technically, it’s possible to taper off in a couple of months. It dragged out to six months for me because dose adjustments sometimes took longer to get used to. I also had to pause on cutting back the dosage during weeks when I had to readjust the dosage of my anti-anxiety medication upwards to counteract issues as they cropped up. But, by mid-November 2014, I was on the lowest “starter” dose of citalopram and I got the go-ahead from my psychiatrist to stop once I finished my last batch.

At first, everything seemed to be going okay. I stopped taking citalopram and felt fine. While it was too early to celebrate, I was secretly doing victory laps on the inside.

And then, a few days after I stopped taking it, everything went spectacularly shit.

The worst part was the first week or so. I found myself hitting random brick walls of grief which came out of nowhere. I found myself crying uncontrollably, something which always wrecks me because I hate crying. What shook me to my core was that all of this was happening without any reason: not a bad day at work, not fights with people, not a sad song or a book or a film or a TV show. I always look for rational explanations for life events, and especially when it comes to emotional matters, I find that I can’t cope when emotions come out of nowhere, because of the loss of control that evokes.

It scared me because after six months of slow progress, I was suddenly facing what had happened at the height of my depression. A sudden and unexpected regression that seemed like an unravelling of everything I had worked to make go away.

The answer, of course, was that I needed to stick to the path of staying off my medication a tad longer to see if the situation improved.

I always talk about the importance of seeking professional medical help. When I detect things going wrong, when I notice the early-warning signs, I make an active effort to set up an appointment with a doctor as soon as possible. I exercise, practise meditation, eat right, ensure that I have an active social calendar so that I don’t stew in my own thoughts, stick to my medication religiously.

What bothered me is that I did everything right. What bothered me is that I have ended courses of antidepressant treatment in the past when my depression went away without any of these problems.

Faced with this supposed regression, I started questioning whether things really had gotten so bad that I needed to stay on them longer. And how much longer I needed to stay on them. And whether there is any light at the end of the tunnel for me. Whether this will affect me for the rest of my life.

At this point, I found myself blaming one person: me. I was the one who thought that I was better. I was the one who asked to be taken off antidepressants. I have been the one pushing myself all these months saying “just a few more days / weeks / months until this is over”. I am the one who asked the training wheels to be taken off. Sure, my doctors agreed with me but they did so on my insistence. I did so because I genuinely thought that I was better off and by all indications things did seem to be going that way. Now, I was on the verge of being proven wrong.

I know better. I know that it’s not “my fault” that I have depression and the right way to go about it is to look at it like any other disease. But that’s not really true, is it?

It’s easy to find strength to fight depression when you know there’s an end. It might be a few months, a year, two years but if you believe this is something that you can get over then it makes it easier to handle the shit cards life has dealt you. That if you fall down, you just need to pick yourself up again and try again.

Maybe this was my depression-like symptoms talking, but through that week of hell I felt that if these symptoms weren’t a passing phenomenon, that if I needed to go back on antidepressants…I’d lost. I felt that I couldn’t pick myself up again – not even “just one more time”. I may have misjudged where the finish line was, but I felt that knowing that, I couldn’t find that last ounce of energy to push myself even a little longer. I felt like I’d reached my limit. As a friend said, I didn’t have the strength to let the loss gut me and carry on.

I had another scare during that period. I was still pushing myself to keep my social life active, and like anyone else, I enjoy drinks with company. Except…on one occasion I blacked out after four glasses of wine.

Now, the interaction between alcohol and antidepressants is a common one – albeit one I’ve never faced. It’s also an interaction with my anti-anxiety medication – pregabalin – but I’d never seriously considered this affected me until that point. I’ve had moments in the past six months when I’ve blacked out – which I discussed with my psychiatrist as well – but the consensus we both reached, and what I believed as well, was that was probably just due to the amount I’d drunk on those occasions (a lot) and the rate at which I had drunk (very quickly).

Blackout after four glasses of wine was new and worrying, because it’s well below my normal tolerance. So I looked into it, and discussed it, and what I found was worrying: it probably wasn’t just a “normal” drunken blackout, but central nervous system (CNS) depression (no relation to normal depression). CNS depression can result in decreased breathing, decreased heart rate, and loss of consciousness with very serious implications leading right up to coma or death. And I wouldn’t even know what was happening because I’d written it off as bog-standard drunkenness. Some people face this side-effect when taking my particular anti-anxiety pills and alcohol. Apparently I was one of those “some people”.

It also made me realise that through that period, I was drinking quite consistently. Never excessively, just a couple of bottles of beer every day and well within the recommended daily intake limit – but I was still doing it. Beer doesn’t affect me the same way wine does (I’ve entirely stayed away from spirits and liqueurs during the weaning-off period; and I don’t drink whiskey) and hasn’t caused CNS depression-like blackouts. I was swapping out juices, colas, and milk for bottles of beer in my fridge. And while what I was doing could by no means be classified as alcoholism (I think; it’s not like I was polishing off bottles of wine daily), I still needed to – and did – put an end to it.


All of that was 3-4 weeks ago, which brings me to now, in 2015. During these weeks, I’ve continued to have major issues with insomnia. On most days, I find myself unable to sleep until 4am, 5am, 6am, or 7am. My sleep cycle is completely disrupted, which obviously affects my general mood as well.

I’ve faced wildly oscillating moods: grief, euphoria, rage, calmness, happiness, anxiety – all of which seem to come and go on their own without any external factors. I’m facing an increasing disconnect between my internal emotions and the brave face that I want to project outside. As I mentioned, what bothers me is the lack of control when I don’t know what’s causing these fluctuations.

Life goes on, work goes on. It’s helped that this happened over the holiday season, when I’ve had time off from work to deal with it. But during the weeks when I was at work, I found myself staving off panic attacks by rushing out of the building for fresh air or for a smoke, or crying in toilets that I know are usually deserted. I’ve broken down many times – including during a hackathon event that I helped organise. And then gone back, as if nothing had happened.

More worryingly, I’ve had recurring thoughts of self-harm. Like I’ve written about earlier, I’ve never been suicidal; pain, for me, has always been a way of dealing with distress, a sense of release. Not being able to sleep drives me up the wall. Even though I hadn’t had thoughts of self-harm in the past months, I’ve worked with my therapist on coping strategies, just in case. For instance, I’ve thrown away bandages and antiseptic wash because knowing they would be there was an enabler for me in “safely” cutting myself. But I’ve found myself berating myself for being stupid when I’ve had thoughts of self-harm, because I would do it “just this once” and I shouldn’t have thrown those items away.

I’ve had the urge to binge eat then force vomit. Not that it matters, because I often feel nauseous anyway.

I’ve had the urge to burn myself with cigarette stubs; as I get more desperate, I’ve imagined doing it in increasingly painful ways: moving on from my limbs to my face to inside my mouth. Or my eyes. Whatever makes this go away, even temporarily.

I’ve had hallucinations – or perhaps that’s the wrong word. Half-awake dreams? Visions? While I’ve still been able to distinguish between reality and imagination, I’ve felt strongly as if my coffee machine or my phone or my desk are part of my body and oh god it’s itching so bad and I need to scratch it until it bleeds. I’ve felt that my bedroom walls are actually just dirt that I can dig out through. I’ve found myself obsessively and laboriously cleaning every surface in my home with cleaning spray and single portions of toilet paper, just to keep myself occupied. Or watching myself into a Netflix coma of endlessly-loading TV shows. Or obsessively gnawing on headphone wires. Anything and everything to regain control.

My employer offers a 24-hour employee assistance hotline that’s operated by a third-party provider. Even though I’ve wanted to, and even though I’ve used it in the past when things were way better, I haven’t been able to bring myself to lean on this avenue for support. People on the outside, with no knowledge of who you are as a person, necessarily err on the side of caution – and they should – but rationally or irrationally I’ve stayed away from it, in case they call in the emergency services or tell my workplace. I don’t want that escalation, just someone to talk to in that mental state, and personally I’m not comfortable with telling them these thoughts when there’s even the slightest possibility of getting formally escalated. Because that would only worsen my sense of control over the situation.

“No matter how bad things are, you can always make things worse.”

– Randy Pausch, The Last Lecture

All of this sounds scary and bad and alarming, I know. I haven’t done any of these things – no matter how strong the urge has been – because even in my worst moments I know that actually doing it would be a slippery slope that leads to much, much worse.

But I’m also trying to finally be honest, brutally so, because facing what I was going through publicly helped me the last time I was in such a bad place. I haven’t spoken this openly – not in its entirety – even to my friends, because in a way I must have realised that admitting it to them would first involve admitting to myself that things were bad. Invoking (once again) one of my personal idols, Paul Carr, who decided to quit drinking publicly…

When I decided to stop, I wrote an open letter on my blog, explaining that I had a serious problem with alcohol and asking for the support of those around me. Posting on Facebook or Twitter for just your friends would work just as well. If you’re worried about your professional reputation if you “come out” as an addict, you might want to consider sending a group email to a dozen or so people you trust. Believe me, word will get around. The key is for people you encounter on a day-to-day basis to be aware that you have a problem and are trying to fix it. Those people are the ones who will be your greatest allies in quitting.

I’m scared shitless and I have been for the past month. I want to put an end to that by taking the step of writing about this because, once again, I’m tired of hiding. I feel the thought of having to tell my friends that I’d lost was playing on my mind. So I’m throwing that out of the window by telling everyone. I’ve come to terms with the fact that if I do have to get back on antidepressants, I will be able to do it. That I’ll not see it as “losing”. That it will be the right thing to do. Or at least, that’s what I believe I can bring myself to do.

While it’s certainly unusual to get antidepressant discontinuation syndrome with a slow taper off (like the one I’ve gone through) and specifically with the medication I was on (citalopram), there is still a chance that it explains everything that I have been going through over the past month. I’m hoping that is the answer.

Maybe it’s too early to declare victory…on my plan to ultimately get better. But I’ll take that. I’m resuming appointments with my psychiatrist and therapist, now that the holiday season is over and they’re back. I can’t call this a New Year’s “resolution” but I hope and wish that going off antidepressants this year will be the first step towards a final resolution of my mental health issues. At least for this time.


The last time that I spoke publicly about my problems was in June 2013, when I found out that I wouldn’t be graduating with everyone else at university. Back then, I didn’t know what impact my ongoing and severe depression problems would have on my life. I didn’t know whether I would even have a degree by the end of 2013, let alone a job. I thought my life was quite fundamentally fucked and that I had no future to look forward to.

I did, ultimately, get a degree in September 2013 (without having an official graduation ceremony). Having a degree at hand and all my work experience would have been of little consolation if my terrible university results (a final degree classification of 2:2) precluded me from finding a permanent job. I’d been progressing with multiple conditional job offers at that point – conditional on my degree results, for which every employer needed a minimum of 2:1 – without a final contract in place.

My student visa in the UK expired on October 14 last year, the day I left the country. Two days later, on October 16, I had a confirmed contract from Accenture on the table.

A lot happened between then and February 2014, when I finally accepted that offer and joined Accenture – my current job. I had another job, took time off, got to meet friends I hadn’t seen for years…but coming back to the narrative I started with, I was still on the highest dose of antidepressants possible. By this point I was also battling with major anxiety disorder issues and insomnia.

Depression is a bastard that doesn’t particularly care about how well your life is going on paper, as mine was. A consistent lack of sleep was affecting my anxiety issues was affecting my sleep was affecting my depression. My biggest worry was that all the jobs I’d had so far were either at startups, universities, or small businesses. I didn’t know how or whether I’d be able to work in a large consulting company (Accenture is as close as it gets to how large a corporation can be) while still being able to deal with my personal issues effectively.

I’d never disclosed my mental health issues in a workplace context, ever. By nature, all the teams that I had worked in until that point were small and tightly-knit – 3-10 people – and even though I’d never told them my issues, the sense of bonding was enough to tide any issues over. How would it turn out when I was part of The Machine? (I say that in the nicest terms, because I was apprehensive about the workplace culture change.)

As hard as it was for me to do, I knew that I needed support…so I decided to “come out” to my company. Not to everyone, but at least to HR and my line managers, so that I got the support that I needed with flexible working times and time off while getting treatment. I got a little courage from the fact that Accenture seemed to have well-defined processes in place to handle these situations – and that there were protections that I had under the UK’s Equality Act.

And I have to say, I’ve been completely blown away by the level of understanding and empathy that I’ve received. I’ve had occupational health counselling to determine what adjustments could be made, monthly catchups with my HR advisor on how I was doing, the ability to be a part of the company’s flexible working programme.

I attended a workshop for the launch of an initiative within the company called “Mental Health Allies” a couple of months ago, which is a step Accenture is taking towards helping employees with mental health problems. There’s one comment that stood out to me, from someone within the leadership team: that 8-10 years ago, when companies talked about such issues, the question they were asking themselves was “Are we liable for this?”; now, the conversation has moved to “How can we help our people?”

I’m genuinely thankful that I’m part of the generation where social norms around depression and other disorders may finally be moving on from outright stigma. (Who knows, even this might be too early to celebrate.) I also feel glad and comforted by the fact that, at least at Accenture, I’m part of a company that takes the wellbeing of its employees very seriously.

This support that I got at work was instrumental in helping me feel better. It meant that I had support structures I could count on, including the option of taking time off should I feel the need to.

Work…gives me a sense of purpose. It helped me battle my depression because it gives me a reason to get out of bed. It motivates me to get better by doing the best job I can.



Selfie time! It was this professor’s first selfie ever – he said he’d heard of this new thing, but never participated. Glad I helped him tick that one off his presumed bucket list, because I’m a millennial like that.

I wanted to end this blog post on a happy note. I did, finally, get to attend a graduation ceremony this year. This time, my invites weren’t pulled at the last minute, and I received the awards for exceptional performance at university that I was supposed to get.

Graduation is obviously an emotional moment in every university student’s life. But what made it special was the vindication I felt from proving wrong all those who said I was in too bad a shape to graduate and should maybe try again later – and from proving it to myself.

Looking back, I now know that my university lecturers did try to do the right thing and weren’t just out to get me. And I can also finally acknowledge that I did find a lot of support through the exceptional people at Surrey University’s Centre for Wellbeing, all of which was critical in helping me get better. And of course, my friends, who in their own ways – even without directly talking about my problems – helped me get over my problems.

That was the moment that I finally felt that I’d gotten over my depression, because I could close the university chapter of my life, the time during which I was at the height of my depression. I’m not going to be in denial if I need to get help again, on this occasion. But it was also a reminder to myself that I want to get better…and I can.

I want to live. Bring on 2015.


Amazon’s Fire Phone is incredibly smart…and what it means for the future of smartphones

The announcement of the Amazon Fire Phone is one of the most interesting pieces of technology news I’ve come across in recent times. While the jury is out on whether it will be a commercial success or not (UPDATE: recent estimates suggest sales could be as low as 35,000 units), the features that the phone comes with got me thinking about the technical advancements that have made it possible.

The caveat here is that much of what follows is speculation – but I do have a background in research projects on speech recognition and computer-vision-related user experience. I’m going to dive into why the Fire Phone’s features are an exciting advance in computing, what it means for the future of phones in terms of end-user experience, and a killer feature I think many other pundits are missing.

Fire Phone’s 3D User Interface

Purkinje image
Glints off an eye are used to correlate a known position to unknown

I did my final year research project on using eye tracking on mobile user interfaces as a method of user research. The problem with many current methods of eye tracking is that they require specialised hardware – typically the approach is to use a camera that can “see” in infrared, illuminate the user’s eye using infrared, and use the glint from the eye to track the position of the eyeball relative to the infrared light sources.

This works fabulously when the system is desktop-based. Chances are, the user is going to be within a certain range of distance from the screen, and facing it at a right angle. Since the infrared light sources are typically attached to corners of the screen – or an otherwise-known fixed distance – it’s relatively trivial to figure out the angles at which a glint is being picked up. Indeed, if you dive into research into this particular challenge in computer vision, you’ll mostly find variations of approaches on how to best use cameras in conjunction with infrared.

Visual angles

The drawback to this approach is that the complexity involved vastly increases when it comes to mobile platforms. To figure out the angle at which the glint is being received, it’s necessary to figure out the orientation of the phone from its accelerometer (its current tilt, relative to gravity) and gyroscope (how quickly the pose of the phone is changing in the world). In addition to this, the user themselves might be facing the phone at an angle rather than at a right angle, which adds another level of complexity in estimating pose. (The reason this is needed is to estimate visual angles.)
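To make “visual angle” concrete: the angle subtended by an object of size S viewed face-on at distance D is 2·arctan(S/2D). This is a minimal sketch of that calculation – the function name and the cm units are my own, not from any eye-tracking library:

```python
import math

def visual_angle_deg(object_size_cm: float, distance_cm: float) -> float:
    """Visual angle (in degrees) subtended by an object of a given size
    viewed face-on at a given distance: 2 * arctan(S / 2D)."""
    return math.degrees(2 * math.atan((object_size_cm / 2) / distance_cm))

# A classic rule of thumb: ~1 cm viewed at ~57 cm subtends roughly 1 degree.
angle = visual_angle_deg(1.0, 57.0)
```

On a desktop rig the distance D is roughly fixed, which is what makes this calculation easy there; on a phone, D and the viewing angle both change constantly, which is exactly the complexity described above.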

My research project’s approach was using techniques similar to a desktop-based eye tracking software called Opengazer coupled with pose estimation in mobiles to track eye gaze. Actually, before the Amazon Fire Phone there’s another phone which touted it had “eye tracking” (according to the NYT): Samsung Galaxy S IV.

I don’t actually have a Samsung Galaxy to play with – nor did the patent mentioned in the New York Times article linked above show any valid results – so I’m basing my guesses on demo videos. Using current computer vision software, given the proper lighting conditions, it’s easy to figure out whether the “pose” of a user’s head has changed: instead of a big, clean circular eyeball, you can figure out there’s an oblong eyeball instead, which suggests the user has tilted their head up or down. (The “tilt device” option for Samsung’s Eye Scroll, on the other hand, isn’t eye tracking at all, as it’s just using the accelerometer / gyroscope to figure out that the device is being tilted.)

What I don’t think the Samsung Galaxy S IV can do with any accuracy is pinpoint where a user is looking at the screen beyond the “it’s changed from a face at right angle to something else”.
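The geometry behind that “circular versus oblong eyeball” observation can be sketched simply: a circle viewed at an angle θ projects to an ellipse whose minor axis shrinks by a factor of cos(θ). A toy calculation under that assumption – this is my illustration, not Samsung’s actual (unpublished) algorithm:

```python
import math

def tilt_from_ellipse_deg(major_axis: float, minor_axis: float) -> float:
    """Estimate viewing tilt from a detected eye ellipse: a circle seen at
    angle theta projects to an ellipse with minor = major * cos(theta)."""
    ratio = min(minor_axis / major_axis, 1.0)  # guard against rounding noise
    return math.degrees(math.acos(ratio))

# A perfectly circular iris implies a head-on view (0 degrees of tilt);
# a 2:1 ellipse implies a tilt of roughly 60 degrees.
```

Note this only gives a coarse tilt magnitude, not a gaze point on the screen – which is consistent with the limitation described above.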

What makes the Fire Phone’s “3D capabilities” impressive?

Watch the demo video above of Jeff Bezos showing off the Fire Phone’s 3D capabilities. As you can see, it goes beyond the current state-of-the-art that the Galaxy S IV has – in the sense that to accurately follow and tilt the perspective based on a user’s gaze, the eye tracking has to be incredibly accurate. Specifically, instead of merely basing motion on how the device is tilted or how the user moves their head from a right angle perspective, it needs to combine device tilt pose, head tilt / pose, as well as computer vision pattern recognition to figure out the visual angles the user is looking at an object from.

Here’s where Amazon has another trick up its sleeve. Remember how I mentioned that glints off infrared light sources can be used to track eye position? Turns out that the Fire Phone uses precisely that setup – it has four front cameras, each with its own individual infrared light source to accurately estimate pose along all three axes. (And in terms of previous research, most desktop-based eye tracking systems that are considered accurate also use at least three fixed infrared light sources.)

So to recap, here’s my best guess on how Amazon is doing its 3D shebang:

  • Four individual cameras, each with its own infrared light source. Four individual image streams that need to be combined to form a 3D perspective…
  • …combined with the device’s orientation in the real world, based on its accelerometer…
  • …and how quickly that orientation is changing, based on its gyroscope

Just dealing with one image stream alone, on a mobile device, is a computationally complex problem in its own right. As hardware becomes cheaper, more smartphones include higher resolution front cameras (and better image sensor density, so it isn’t just the resolution but also the quality that improves)…which in turn gives better quality images to work with…but it also creates another problem in that there’s a larger image to process onboard the device. This is a challenge because, based on psychological research into how people tend to perceive visual objects, there’s a narrow window – within the range of 100s of milliseconds – within which a user’s gaze rests at a particular area.

On a desktop-class processor, doing all of this is a challenge. (This article is a comparison of JavaScript on ARM processors vis-a-vis desktop, but the lessons are equally valid for other, more computationally complex tasks such as computer vision.) What the Amazon Fire Phone is doing is combining images from four different cameras, as well as its sensors, to form an image of the world to change perspective…in real-time. As someone who’s studied computer vision, this is an incredibly exciting advance in the field!

My best guess on how they’ve cracked this would be to use binary segmentation instead of feature extraction. That was the approach I attempted when working on my project, but I could be wrong.
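To illustrate the distinction (with toy code of my own, not the Fire Phone’s pipeline): binary segmentation reduces each frame to a 1-bit mask with a single comparison per pixel, which is vastly cheaper than computing feature descriptors – at the cost of being sensitive to lighting and the choice of threshold.

```python
def binarise(image, threshold=128):
    """Segment a greyscale image (rows of 0-255 intensity values) into a
    binary foreground/background mask: one comparison per pixel."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]

# A tiny 3x3 "frame": two bright regions stand out against a dark background.
frame = [
    [12, 200, 210],
    [8, 190, 15],
    [5, 7, 9],
]
mask = binarise(frame)
# mask -> [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
```

A real pipeline would operate on the thresholded mask (e.g. finding the bright glint blobs) rather than running expensive descriptor extraction on every full-resolution frame.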

Is a 3D user interface a good idea though?

Right now, based purely on the demo, it seems that the 3D interface is a gimmick that may fly well when potential customers are in a store testing out the product. It could be banking on “Wow, that’s really cool”, as Amazon’s marketing seems to be positioning itself. Personally, I felt the visual aesthetics were less 21st century and more like noughties 3D screensavers on Windows desktops.

Windows 2000 3D pipes screensaver
“Look at all that faux depth!” – An Amazon executive, probably

Every time a new user interface paradigm like Fire Phone’s Dynamic Perspective, or Leap Motion controller comes along, I’m reminded of this quote from Douglas Adams’ The Hitchhiker’s Guide To The Galaxy (emphasis mine):

For years radios had been operated by means of pressing buttons and turning  dials; then as the technology became more sophisticated the controls were made touch-sensitive – you merely had to brush the panels with your fingers; now all you had to do was wave your hand in the general direction of the components and hope. It saved a lot of muscular expenditure of course, but meant that you had to sit infuriatingly still if you wanted to keep listening to the same programme.

My fear is that in an ever-increasing arms race to wow customers with new user interfaces, companies will go too far in trying to incorporate gimmicks such as Amazon’s Dynamic Perspective or Samsung’s Eye Scroll. Do I really want my homescreen or what I’m reading to shift away if I tilt my phone one way or the other, like the Fire Phone does? Do I really want the page to scroll based on the angle at which I’m looking at the device, like Samsung does? Another companion feature on the Galaxy S IV, called Eye Pause, pauses video playback if the user looks away. Anecdotally, I can say that I often “second screen” by browsing on a different device while watching a TV show or a film…and I wouldn’t want playback to pause merely because I flick my attention between devices.

Another example of unintended consequences of new user interfaces is the Xbox One advert featuring Breaking Bad‘s Aaron Paul. Since the Xbox One comes with speech recognition technology, playing the advert on TV inadvertently turns viewers’ Xboxes on. Whoops.

What’s missing in all of the above examples is context – much like what was illustrated by Douglas Adams’ quote. In the absence of physical, explicit controls, interfaces that rely on human interaction can’t distinguish whether a user meant to change the state of a system or not. (Marco Arment talks through this “common sense” approach to tilt scrolling used in Instapaper.)

One of the things that I learnt during my research project was that there’s a serious lack of usability studies for mobile devices in real-world environments. User research on how effective new user interfaces are – not just in general terms, but also at the app level – needs to be dug into more deeply to figure out what’s going on in the field.

In the short-term, I don’t think sub-par interfaces such as the examples I mentioned above will become mainstream, because the user experience is spotty and much less reliable. Again, this is pure conjecture because, as I pointed out, there’s a lack of hard data on how users actually behave with such new technology. My worry is that if such technologies become mainstream (they won’t right now; patents) without solving the context problem, we’ll end up in a world where hand-gesture-sensitive radios are common purely because “it’s newer technology, hence, it’s better”.

(On a related note: How Minority Report Trapped Us In A World of Bad Interfaces)

Fire Phone’s Firefly feature: search anything, buy anything

Photo: Ariel Zambelich/WIRED

Staying on the topic of computer vision, another one of the headline features of Amazon’s Fire Phone is Firefly – which allows users to point their camera at an object and have it ready to buy. Much of the analysis around the Fire Phone that I’ve read focuses on the “whale strategy” of getting high-spending users to spend even more.

While I do agree with those articles, I wanted to bring another dimension into play by talking about the technology that makes this possible. Thus far, there has been a proliferation of “showrooming” thanks to barcode scanner apps that allow people to look up prices for items online. So the thinking goes: a feature like Firefly, which reduces friction in the process, will induce that kind of behaviour further and encourage people to shop online – good for Amazon, because they get lock-in. Amazon went so far as to launch a standalone barcode scanning device called the Amazon Dash.

My hunch – from my experience of computer vision – is that the end-user experience using a barcode scanner versus the Firefly feature will be fundamentally different in terms of reliability. (Time will tell whether it’s better or worse.) Typically for a barcode scanner app:

  • Users take a macro (close-up) shot of the barcode: This ensures that even in poor lighting conditions, there’s a high-quality picture of the object to be scanned. Pictures taken at an angle can be de-skewed and transformed to a flat picture before processing, and this is computationally “easy”.
  • Input formats are standardised: Whether it’s a vertical-line barcode, a QR code or any of the myriad formats, there’s a limited subset of patterns that need to be recognised. This raises the probability that pattern recognition algorithms will find an accurate match.

Most importantly, thanks to standardisation in retail, if the barcode conforms to the Universal Product Code or International Article Number (the two major standards), any lookup can be accurately matched to a unique item (“stock keeping unit” in retail terminology). Once the pattern has been identified, a text string is generated that can be quickly looked up in a standard relational database. This makes lookups accurate (was the correct item identified?) and reliable (how often is the correct item identified?).
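
To make the “easy” part concrete, here’s a minimal Python sketch of the decode-then-lookup step. The check-digit arithmetic is the genuine UPC-A formula; the one-entry catalogue dictionary is an invented stand-in for the retailer’s relational database.

```python
# Sketch of the decode-then-lookup step for a UPC-A barcode scan.

def upc_a_check_digit(digits):
    # Digits in odd positions (1st, 3rd, ...) are weighted 3, even
    # positions weighted 1; the check digit brings the total to a
    # multiple of 10.
    odd = sum(digits[0:11:2])
    even = sum(digits[1:11:2])
    return (10 - (odd * 3 + even) % 10) % 10

def lookup_sku(barcode, catalog):
    # Validate a 12-digit UPC-A string, then resolve it to a unique item.
    if len(barcode) != 12 or not barcode.isdigit():
        return None
    digits = [int(c) for c in barcode]
    if digits[11] != upc_a_check_digit(digits):
        return None  # misread scan: reject rather than return a wrong item
    return catalog.get(barcode)

catalog = {"036000291452": "Facial tissues, 160 ct"}  # illustrative entry
print(lookup_sku("036000291452", catalog))  # prints: Facial tissues, 160 ct
```

Because every valid code maps to exactly one SKU, a misread scan is far more likely to fail the check digit than to silently resolve to the wrong product – which is precisely the reliability property Firefly can’t get for free.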

I haven’t read any reviews yet on how accurate Amazon Firefly is, since the phone has just been released. However, we can get insights into how it might work, since Amazon released a feature called Flow for their iOS app that does basically the same thing. Amazon Flow’s own product page from its A9 search team doesn’t give much insight, and I wasn’t able to find any patents that might have been filed related to it. I did come across a computer vision blog, though, that covered similar ground.

How Amazon Flow might work

Now, object recognition on its own is a particularly challenging problem in computer vision, from my understanding of the research. Object recognition – of the kind Amazon Firefly would need – works great when the subset is limited to certain types of objects (e.g., identifying fighter planes against buildings) but things become murkier if the possible subset of inputs is anything that a user can take a picture of. So barring Amazon making a phenomenal advance in object recognition that could recognise anything, I knew they had to be using some clever method to short-circuit the problem.

The key takeaway from the image above is that in addition to object recognition, Amazon’s Flow app quite probably uses text recognition as well. Text recognition is a simpler task because the subset of possibilities is limited to the English alphabet (in the simplest case; of course, it can be expanded to include other character sets). My conjecture is that Amazon is actually relying on text recognition rather than object recognition; it’s extracting the text that it finds on the packaging of a product, rather than trying to figure out what an item is merely based on its shape, colour, et al. News articles on the Flow app seem to suggest this. From Gizmodo:
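
A text-first scanner could be structured along these lines – a hypothetical Python sketch, where `run_ocr` is a stand-in for a real OCR engine (e.g. Tesseract) and returns hard-coded fragments so the example stays self-contained. All the names and numbers are invented.

```python
# Hypothetical text-first product scanner: OCR the packaging, keep the
# most prominent fragments, and use them as a product search query.

def run_ocr(image):
    # Each fragment: (text, confidence in [0, 1], glyph height in pixels).
    # Hard-coded here; a real engine would derive these from the image.
    return [("TABASCO", 0.97, 120), ("BRAND", 0.91, 40),
            ("Pepper Sauce", 0.88, 55), ("est. 1868", 0.45, 18)]

def build_search_query(fragments, min_confidence=0.8):
    # Keep only confident reads, then prefer the tallest text: on product
    # packaging, the biggest type is usually the brand or product name.
    usable = [f for f in fragments if f[1] >= min_confidence]
    usable.sort(key=lambda f: f[2], reverse=True)
    return " ".join(f[0] for f in usable[:3])

query = build_search_query(run_ocr(None))
print(query)  # TABASCO Pepper Sauce BRAND
```

The “prefer the tallest, most confident text” heuristic encodes exactly the assumption reviewers keep bumping into: items with bold, big typography scan best, and everything else gets flaky.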

In my experience, it only works with things in packaging, and it works best with items that have bold, big typography.

Ars Technica tells a similar story, specifically for the Firefly feature:

When scanning objects, Firefly has the easiest time identifying boxes or other packaging with easily distinguishable logos or art. It can occasionally figure out “naked” box-less game cartridges, discs, or memory cards, but it’s usually not great at it.

If this is indeed true, then the probability is that the app is doing text recognition – and then searching for that term on the Amazon store. This leaves open the possibility that even though the Flow app / Firefly can figure out the name of an item, it won’t necessarily know the exact stock keeping unit. Yet again, another news article seems to bear this out. From Wired:

And while being able to quickly find the price of gadgets is great, the grocery shopping experience can sometimes be hit-or-miss when it comes to portions. For example, the app finds the items scanned, but it puts the wrong size in the queue. In one instance, it offered a 128-ounce bottle of Tabasco sauce as the first pick when a five-ounce bottle was scanned.

From a computer vision perspective, this is not surprising, since items with the same name but different sizes might have the same shape…and based on the distance a picture is shot at, the best guess of a system that combines text + shape search may not find an exact match in a database. It’s also a significantly more complex database query, as it needs to compare multiple feature sets which may not necessarily be easily stored in a relational database (think NoSQL databases).
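
The Tabasco mix-up falls straight out of a text-only lookup: one product name maps to several SKUs, and if no size cue survives the scan, the system has to break the tie with some other signal. A hypothetical Python sketch (the catalogue entries, SKU codes and popularity scores are all invented for illustration):

```python
# Why a text-only match can grab the wrong size: a name query matches
# every SKU sharing that name, and with no size information from the
# image, the system falls back on another signal (here, popularity).

catalog = [
    {"sku": "TB-0005", "name": "Tabasco Pepper Sauce", "size_oz": 5, "popularity": 0.7},
    {"sku": "TB-0128", "name": "Tabasco Pepper Sauce", "size_oz": 128, "popularity": 0.9},
]

def match_by_name(query, catalog):
    q = query.lower()
    hits = [item for item in catalog if item["name"].lower() in q]
    # No size cue survives the scan, so rank by popularity alone.
    return max(hits, key=lambda item: item["popularity"]) if hits else None

best = match_by_name("TABASCO Pepper Sauce", catalog)
print(best["sku"], best["size_oz"])  # TB-0128 128: the big jug, not the 5 oz bottle
```

Contrast this with the barcode path: a UPC resolves to exactly one SKU, so the size question never even arises.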

How does all of the above impact the usage of the Firefly feature?

The specifics of what Amazon Firefly / Flow can and cannot do will have a significant impact on usage. If it can identify objects and items based on shape, in the absence of packaging or a name, then Amazon Firefly will be a game changer. It’s a hitherto unprecedented use case that allows people to buy items they don’t even know the exact name for, and hence it creates an opportunity to buy items (from Amazon, of course) which they otherwise may not have been able to buy.

If, however, Amazon Firefly can merely read bold text on a bottle / packaging, then it will remain a gimmick. Depending on how good the text recognition is, users may find no significant benefit compared to typing out the name of a product which is clearly visible.

…but even if Firefly isn’t amazing yet, Amazon has a plan

One of the fundamental problems in trying to solve computer vision problems (such as object recognition) is that researchers need datasets to test out algorithms on. Good datasets are incredibly hard to find. Often, these are created by academic research groups and are thereby restricted by limited budgets in the range of pictures gathered in a database.

Even if, right now, Amazon Firefly can only do text recognition, releasing the Flow app and the Firefly feature into the wild allows them to capture a dataset of images of objects at an industrial scale. Presumably, if it identifies an object incorrectly and a user corrects it to the right object manually, this is incredibly valuable data which Amazon can then use to fine-tune their image recognition algorithms.

Google has a similar app called Google Goggles (Android only), which can identify a subset of images, such as famous pieces of art, landmarks, text, barcodes, etc. At the moment, Google isn’t using this specifically in a shopping context – but you can bet they will be collecting data to make those use cases better.

Dark horse feature: instant tech support through Mayday

Amazon’s Mayday button provides real-time video tech support

Among all of the reviews that I’ve read for Amazon Fire Phone in tech blogs / media, precisely one acknowledged the Mayday tech support feature as a selling point: Farhad Manjoo’s article in the New York Times. Perhaps tech beat writers who would rather figure out things themselves didn’t have much need for it, and hence skipped over it. Mayday provides real-time video chat 24 / 7 with an Amazon tech support representative, with the ability for them to control and draw on the user’s screen to demonstrate how to accomplish a task.

Terms such as “the next billion” often get thrown about for the legions of people about to upgrade to smartphones. This could be in developing countries, but also for people in developed countries looking to upgrade their phones. Instant tech support, without having to go to a physical store, would be a killer app for many inexperienced smartphone users. (Hey, maybe people just want to serenade someone, propose marriage, or order pizza while getting tech support.)

I think that, fundamentally, beyond all the gimmicky features like Dynamic Perspective, the Mayday feature is what is truly revolutionary – in terms of how Amazon has been able to scale it to provide a live human to talk to within 10 seconds (not just from a technical perspective, but also in how to run a virtual contact centre). Make no mistake: while Amazon may have been the first to do it at scale, this is where the future of customer interaction lies. The technology behind Mayday could easily be offered as a white-label solution to enterprises – think “AWS for contact centres” – or in B2C to offer personalised recommendations or app-specific support.

(Also check out this analysis of how Amazon Mayday uses WebRTC infrastructure.)

What comes next from Amazon?

Amazon’s famously opaque charts-without-numbers

There aren’t sales figures yet for the Amazon Fire Phone. Don’t hold your breath for them either: Amazon is famous for its opaque “growth” charts at all its product unveiling events. It may be able to push sales through the valuable real estate it has on its home page – or it might fall flat. Regardless of what happens, this opacity does afford Amazon the luxury of refining its strategy in private. (Even if it sells 100 units vs 10 units in the previous month, you can damn well be sure they’ll have a growth chart showing “100% increase!!!”)

One thing that’s worth bearing in mind is that Jeff Bezos has demonstrated a proclivity for the long game, rather than short-term gains. Speaking to the New York Times, he said:

We have a long history of getting started and being patient. There are a lot of assets you have to bring to bear to be able to offer a phone like this. The huge content ecosystem is one of them. The reputation for customer support is one of them. We have a lot of those pieces in place.

You can see he touches on all the points I mentioned in this article: ecosystem, research, customer support. Amazon’s tenacity is not to be discounted. Even if this version-one Fire Phone is a flop, they’re sure to iterate on making the technology and feature offerings better in the next version.

Personally, I’m quite excited to see how baby steps are being taken in terms of using computer vision (Fire Phone, Word Lens!), speech recognition (Siri, Cortana), and personal assistants (Google Now). Let the games begin!