RedMonk analyst Kate Holterhoff and Allen Romano, co-founder of Logoi, chat about their experiences as academics turned tech industry professionals. They discuss the discipline of digital humanities; how founding a startup both resembles and diverges from PhD work; and the future of AI.
This was a RedMonk video, but it was not sponsored by any entity.
Rather listen to this conversation as a podcast?
Transcript
Dr. Kate Holterhoff: Hello and welcome to this episode of RedMonk’s The Docs Are In series. I’m joined today by Dr. Allen Romano, a former academic and co-founder of Logoi, an AI-powered, searchable knowledge base. Allen, thanks so much for joining me today.
Dr. Allen Romano: Well, thank you for having me. I’m looking forward to talking about all of this.
Kate: Great. Allen and I just missed meeting in person earlier this year in Denver. While we were both attending Gluecon, it was a pity our paths just didn’t quite cross at the event itself. But we learned afterward that we have a lot in common, and our backgrounds, both as academics and in the discipline of digital humanities, really make his story particularly compelling to me. On this series we’re really digging into how academia and the tech industry intersect. I’ve given a real quick history of Allen’s credentials here, but I want to hear more about your decision to leave the university and co-found a startup. Could you talk about what that journey was like and what led you to decide you were prepared to leave academia’s marble walls behind?
Allen: Yeah. So the digital humanities side of it is a more recent one, and that’s really the second time leaving academia. So this is a long time coming for me in a lot of ways. I mean, I spent a long time as a professor, and the first step was essentially moving out of classics, where I had done a lot of digital work. My early training was in ancient Greek, and I was doing things with digital text. And what really set the stage was that transition: moving away from such a narrow focus on a certain kind of research to doing a whole lot of teaching and a whole lot of thinking, particularly with building that digital humanities program at Florida State. Thinking about how students who might have a humanistic background, or might be interested in humanities material, can get technical training in order to have jobs that aren’t just in the academy, aren’t just in the university. So in essence I was training all these other folks, particularly graduate students and others around me in the university, how to do web development, how to do natural language processing for humanities kinds of projects.
Allen: But with this idea that that would be training for then getting a job in these other fields. And really what happened, in a certain way, was watching them get jobs and be successful and thinking, wait a second, I taught them how to do that. This looks like a lot of fun, and I need to take some of that elsewhere. So that was a big part of it. And then there’s definitely a huge part of it too — digital humanities as a field deals with very pressing issues about how technology is used in society, how technology is adopted, what effects technology has on us. I taught a whole lot in that area. And there was just this growing feeling of, I need to be out there doing something, as opposed to looking at it from afar. That really was one of many impulses that pushed me in an entrepreneurial direction coming out of the university.
Kate: Yeah, that reflects my own experience of seeing all this cool stuff that could be done and feeling like I wanted to be creating that sort of technology as well — being involved at the forefront and not just recording the history of it. So that’s a perspective that makes a lot of sense. I think a really good thing for us to talk about now would be digital humanities as a discipline. A lot of folks who watch this series probably aren’t familiar with it. It’s definitely an academic subset of humanistic inquiry that, you know, I haven’t seen get outside of those borders as much. So, would you start by defining what digital humanities is for you and how you treated that subject? And —
Allen: Yeah!
Kate: Can you draw any threads between the tech industry and DH that maybe I haven’t thought of?
Allen: I think it’s one of those things — I’ve never been a huge fan of that term, Digital Humanities. I don’t know about you, but I always find it kind of difficult. So I often explain it in terms of an area that’s more at home for me, which is: oh, it’s computation on history and literature — you know, everything that Google was doing to your emails, but we can do it to Shakespeare or something. That’s one quick way to get your head around it. But of course Digital Humanities is way more than that. The way I often describe it for students who are coming to the field is that it’s the area of humanities that’s really focused on methods of the here and now, both implementing them — technological methods — and thinking through their consequences. And so you run the whole spectrum from very technical work — in my case linguistics and computational linguistics — to work on society and history and even politics, and how these things interact around any number of issues of interest to humanists, both in the present and the past. So it becomes a really flexible framework. And it’s in some ways always a pity that, like you said, so much of it isn’t restricted to academia, but it doesn’t necessarily have the visibility. Most of what’s going on in AI and a lot of these areas nowadays is, in my mind, Applied Humanities. It is essentially Digital Humanities in some form, and there are plenty of digital humanists in these tech companies doing this kind of work. But it’s not framed like that. It’s not framed as Applied Humanities, and in a lot of ways I wish it was.
Kate: Right, yeah, I know. When I bring up Digital Humanities, I get a lot of blank stares in my current role, so I kind of push that area of my history into the background a little bit. But when I did DH work, I studied a lot of archives and tried to scan documents that are hard to find online. So there was, I guess, a library science component to my DH work.
Allen: Library science, humanities — it’s way beyond arts and sciences at a university. By nature it’s inherently interdisciplinary. There’s this whole thing about humanities data that I think is a connection between digital humanities and a lot of what’s going on now, where you’ve got all these technologies that are dealing with very human data, but coming at it not explicitly from a humanistic point of view — they may not have to come from that point of view — but where a humanities background really does help a lot. Because, as you describe from your own experience working in archives and working with that kind of data, you have experience with the messy, unstructured data that is just the bread and butter of humanities research and humanities teaching and humanities work. And sometimes in these discussions it feels like people are surprised by this. And it’s like, well, humanists aren’t surprised by this. This is exactly what they’ve dealt with. It’s when you come to it expecting clean data that you get surprised. So there’s really something about the moment now with data in particular, before we even talk about AI, that I think is a connection with digital humanities.
Kate: Yeah, absolutely. I think the current wave of AI engages the humanities at a really deep level. We have ChatGPT, this tool for writing. DALL-E generates visual art. And many of these new AI tools and technologies really depend on collecting, organizing, and retrieving data. We keep reading blogs and hearing podcasts about how AI is data — that really is the foundation of all the new innovations in that space. And that data aspect is a subject library science has managed for millennia. So it makes a lot of sense that our definition of DH would integrate that as part of the remit.
Allen: Yeah. One of my favorite courses to teach was this course called Technologies of Memory, which I kind of invented for grad students at the University of Chicago and then taught as an undergraduate course for many years. It really was this long history of, more or less, humanities data and memory studies meets history of technology and digital humanities, essentially. We’d go from cuneiform tablets and list making, through catalog poetry in the ancient world as a form of technology, a form of memory, to memory palaces and medieval rhetoric and all these kinds of things, all the way to LLMs and deep learning and the most recent technological advances. And when you look at that really long history, I think a whole lot of things become really striking about the technological moment — not just that it’s not new in some ways, but that with issues which often get a lot of press, suddenly you say, oh, well, this isn’t the first time we’ve thought about, say, authenticity, right? Or the first time we’ve thought about what it means to be a mimic, the way LLMs are mimics, or how to assess that in terms of intelligence. There’s really interesting work on that in all periods. And so that kind of history has always been really exciting to me. But I think that’s also part of humanities data. It’s not just the technical dealing with it; it’s also that way of putting things in dialogue and saying, okay, well, what would Plato think about ChatGPT? What would different people in the Renaissance, or outside of Europe — in the Near East, in ancient Africa, wherever — think? That brings a radically different perspective, which is, I think, really helpful for defamiliarizing what’s going on right now and allowing you to think differently about it.
Kate: So would you say that this new LLM space — AI, machine learning, all of this — is really engaging humanist-type questions in a deeper way than maybe other areas in the tech space? Is that maybe what drew you to co-found an AI startup, or is it just more of the same and you just happened to come into it at that time?
Allen: Well, yeah. So there’s a degree of both, right? It was very fortunate that I focused on language, and I worked with students on GPT-2 and earlier language models. So that was fortuitous. And one of the things I did myself, a number of years ago, was build an early language model with ancient Greek to get it to generate new Greek. That’s fun, it’s interesting. But really the thing that was, I think, more motivating and intentional was that there are these questions about knowledge — how we know things, how using a language model in particular changes the way that you get information. It changes how you look back at a huge amount of compressed information — all that data they were trained on, right? — which kind of gets expanded out when you give it a prompt in a certain direction. And there is a way in which it’s an access point to a whole lot of the communicative record of the past. Yes, it’s a lot of Internet content, but along with that are things like most of the literary record that’s online, things in a number of languages. I mean, for someone who studied classics, getting it to generate ancient Greek is… it’s like a whole bunch of fantasies from doing digital humanities work, where it’s like, oh look, it does it. And now we can explore: how does it do it that way? Why does it do it that way? What does that tell us? And then those have implications for using these in a startup, using these as tools, as building blocks of new technology, and really sort of creativity around that.
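For readers curious what the kind of experiment Allen mentions might look like in practice, here is a minimal sketch of fine-tuning GPT-2 on an ancient Greek text corpus with the Hugging Face transformers and datasets libraries. The corpus file name, hyperparameters, and seed prompt are illustrative placeholders, not the actual setup he used.

```python
# A rough sketch (not the code Allen describes) of fine-tuning GPT-2 on an
# Ancient Greek corpus so it can generate "new Greek". The corpus file,
# hyperparameters, and seed prompt are illustrative placeholders.
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token

# Hypothetical plain-text corpus of Ancient Greek, one passage per line.
corpus = load_dataset("text", data_files={"train": "ancient_greek.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_data = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-greek", num_train_epochs=3),
    train_dataset=train_data,
    data_collator=collator,
)
trainer.train()

# Sample some new Greek from the opening words of the Iliad.
inputs = tokenizer("μῆνιν ἄειδε θεὰ", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=60, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```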
Allen: That’s the thing that ultimately drew me: this flexible set of tools. And as it exploded this past year especially — we were working on them well before November or October, when it sort of flashed into folks’ consciousness — that accelerated a whole bunch of changes and awareness around these tools, which meant building things became a lot faster, with a lot more stuff out there to think about. And it’s been very exciting to think about how you can add value with this growing toolkit, which is developing more rapidly than anyone would have expected a year and a bit ago if you had asked them, I think.
Kate: Right. So I feel like I have a good sense of what drew you to AI and machine learning in terms of the content that your company focuses on. But is there anything from your history — your training as an academic, your teaching background, your research interests — that you also felt really fitted you to succeed as the founder of a company?
Allen: Yeah, it’s interesting, because you’d think PhD training and founding a startup — I would have thought those were really different things. But I think they’re actually pretty similar in lots of ways. There are lots of differences one could draw between academia, like being a professor, and a startup. The main one is pace, right? You just get to move so much faster in a startup. But in a lot of ways I wonder if PhD students in particular would gain a lot from thinking about what they’re doing in terms of launching their own startup — that they are launching their own business and moving quickly toward learning in particular. You’re trying to get your bearings, get as much awareness as possible of everything going on in that field. You’re really becoming a professional as a PhD student. There is a version of that in the startup world. A friend of mine who was much further along in the startup journey, and is now the CEO of a tech company, said very clearly that the one thing guaranteed in a startup journey is that you’ll learn. And that was very striking. She’s also a former academic. Because of course that’s, I think, the fear folks might have if you’re in a kind of academic bubble: oh, I’m in the place where learning happens.
Allen: But it was wonderfully true that the startup journey has delivered completely on constant learning, and a really high degree of learning. Not just new skills, but whole areas that you can get up to speed on really fast, and you can interact with other people who are fairly generous as a community of knowledge and practice. And that is, I think, something that’s much more similar between the PhD journey and the startup journey. It translates pretty directly, in the sense that all that learning is how you built up your skill set as a PhD student — being able to absorb things rapidly, to get up to speed, to work with peers who were working on other projects, things like that. That skill translates very directly, in my mind, before we even get to technical skills. It activates that same center.
Kate: After speaking with a number of folks who’ve gone through this PhD journey, and other folks who’ve left academia, do you think it’s more difficult to start a company or to complete a PhD?
Allen: Starting a company, I think, is more difficult. I think I finally came down on that. PhDs can take a lot longer; they can kind of linger in that way. But with starting a company, there is so much that one could do, and it’s that kind of constant focus and editing, and you have to do it so quickly — I think that’s the difference that makes it way more challenging. There’s a lot more comfort in the PhD journey. You can kind of slowly find your focus.
Kate: Yeah, yeah, I get that. I mean, the PhD journey is really centered on your mentor. And of course the politic answer is always, it depends, right? It depends on who your dissertation director is, how your program looks, what your funding situation is, all those sorts of things. But I can see what you mean: there is maybe a sort of loneliness, or the stakes are a little bit different — in some ways they’re higher, in the sense that you have less time to do it and less support. And, you know, there’s a lot of money involved as well.
Allen: Yeah. And I would say that in a lot of ways the startup journey regains a lot of the excitement that people have when they go into a PhD — that sort of newness to things, and the fact that you have some control over it in some ways, right? That this is your journey, you know?
Kate: Yeah, I like that. Yeah. I mean you lose control when you start the PhD program. It takes a certain type to finish it.
Allen: Yeah, exactly.
Kate: All right. Well, since you have done so much in the space, I know I’m curious, where do you think the AI/LLM/machine learning space is going in the future? What is it going to look like in five, ten years and what are you most excited about?
Allen: So here’s where the good answer would of course be to say that making predictions is a very dangerous game, particularly in this space, because we’ve just seen what no one expected to happen around unstructured data, and particularly generative tools, accelerate very rapidly. So most predictions that I’ve seen, and that I more or less agree with, point to this continued acceleration. But what’s interesting is that I don’t think it’s actually a direct line to some sort of all-capable intelligence. I think that distracts us a lot from the other technologies along the way. And so where I tend to focus is this: the kind of hype and noise is around a certain set of core functionality with something like ChatGPT or DALL-E or Midjourney, where it’s going to give you a really nice imitation of something. But where I think the use cases are going to expand rapidly, and where we’re going to see a whole bunch of new things, is that these can act as sort of universal translators. As you start connecting them, you’re going to see ways of translating one domain to another.
Allen: Structured to unstructured, at the level of data. And of course multimodal things, which everyone knows are on the horizon, with image and audio and text all being generated together or integrating with each other. I think that’s where there are new category possibilities that I’m really interested in. What happens when you suddenly have, say, fluid interfaces that are streaming from large language models or other sorts of generative technologies? Does that change the way you do front end, if it’s not just an API that you’re calling for something? Those kinds of creative possibilities are really interesting to me. Alongside that, in the AI space itself, I expect better algorithms — all the research that’s clearly been going into that; there’s an arms race going on, and I can’t imagine it doesn’t end up with radically more powerful tools compared to what we’re looking at now. So I think there are those two levels, but in a way I’m more interested in what might emerge from the combinations of these things, which I think will be kind of different from what we see right now.
Kate: Well, dangerous as it is to make predictions, I appreciate you getting out your crystal ball for us. And so I want to thank Dr. Romano for coming on The Docs Are In. I will provide links to Logoi and Dr. Romano’s social handles. And with that, the Docs are out.