TigerGraph Developer Spotlights introduce you to people around the world who use TigerGraph to build powerful connected-data solutions for a wide range of use cases.
Meet David Baker Effendi
Learn about David Baker Effendi, Ph.D. Candidate at Stellenbosch University and Applied Scientist Intern at Amazon, and how he is using his knowledge of graph to develop tools and documentation to help others learn TigerGraph!
Tell us a little about yourself, where you are from, and what you are currently working on.
I am from South Africa, and I am currently attending Stellenbosch University. I’ve been studying there since 2016, when I started my Bachelor of Science in Computer Science. I am now a Ph.D. candidate in Computer Science, and my research is mostly static program analysis, more specifically taint analysis, which is a type of analysis where we track data flow to ensure that sensitive data isn’t leaked and that malicious incoming data doesn’t reach something sensitive. We represent the program as a graph, which is where a lot of my expertise with graph databases becomes particularly useful. I am currently an Applied Scientist Intern on the Amazon CodeGuru team, where the service uses program analysis and an internal graph structure to automatically check customer code for bugs and quality issues.
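To make the idea of taint analysis on a program graph concrete, here is a minimal sketch that treats it as a reachability question: can data from an untrusted source reach a sensitive sink without passing through a sanitizer? The function name `find_taint_flows`, the node names, and the edge list are all hypothetical illustrations, not part of David's research tooling or of CodeGuru.

```python
from collections import deque


def find_taint_flows(edges, sources, sinks, sanitizers=frozenset()):
    """Return one shortest path for every (source, sink) pair that is connected.

    edges      -- iterable of (from_node, to_node) data-flow edges
    sources    -- nodes where untrusted data enters
    sinks      -- set of nodes that must not receive untrusted data
    sanitizers -- set of nodes that clean the data, so flows stop there
    """
    # Build an adjacency list from the edge list.
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    flows = []
    for source in sources:
        parent = {source: None}  # doubles as the visited set
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node in sinks:
                # Walk the parent links back to the source to recover the path.
                path = []
                current = node
                while current is not None:
                    path.append(current)
                    current = parent[current]
                flows.append(path[::-1])
                continue
            for nxt in graph.get(node, []):
                if nxt in sanitizers or nxt in parent:
                    continue  # sanitized or already visited
                parent[nxt] = node
                queue.append(nxt)
    return flows


# Hypothetical data-flow graph of a tiny web handler.
EDGES = [
    ("http_param", "user_input"),
    ("user_input", "query_string"),
    ("query_string", "db.execute"),
    ("user_input", "escaped_input"),
    ("escaped_input", "log.write"),
]

flows = find_taint_flows(
    EDGES,
    sources=["http_param"],
    sinks={"db.execute", "log.write"},
    sanitizers={"escaped_input"},
)
print(flows)
```

In this toy graph the only reported flow runs from `http_param` to `db.execute`; the path to `log.write` is cut off because it passes through the sanitizer node. Real analyses work on much richer graphs, but the underlying question is the same kind of path query.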
What interested you in TigerGraph, and what has it been like working with the platform?
In my fourth year of studying, I started with a project where we wanted to investigate how suitable graph databases were for analyzing spatio-temporal data. At the time, my research supervisor was interested in graph databases, and assigning students to research and build projects was a way we could all learn more about the topic. Going into the project, he suggested I try out a few of the older graph technologies. However, these were very memory heavy, and as university students we didn’t have access to large, powerful hardware. I looked around, TigerGraph came up as a suggestion, and I decided to try it. It was definitely the best in terms of being able to model the full size of the data set I was using within my hardware constraints, and that’s how I started working with TigerGraph. After that, I was pretty involved in teaching a web development course. As my research veered away from databases, I focused more on using the graph structure in static analysis. Stellenbosch University still uses TigerGraph to introduce database visualization and understanding data in the courses we teach.
What has been your favorite graph-based project that you have worked on?
I would say it’s what came out of my master’s, which is a project called Plume. The conclusion of that project was submitting it to the TigerGraph Graphathon 2020 and winning a prize. The project is about marrying static analysis and graph representation with graph databases and comparing their performance. It’s quite a big undertaking. From a design perspective, we had to understand how to run the same operations and recycle them to work on an abstracted database layer, which forces you to almost directly compare each task you want to do and how it’s done for each different database type. This project helped me understand the pros and cons of every supported graph database, and it was quite a lot of fun. And I guess this project is why many people call me the graph expert at the university.
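The "abstracted database layer" idea can be sketched roughly as follows: every backend implements one small interface, and each analysis operation is written once against that interface. This is only an illustration under stated assumptions; `GraphDriver`, `InMemoryDriver`, and `reachable` are hypothetical names and do not reflect Plume's actual API.

```python
from abc import ABC, abstractmethod
from typing import List, Set


class GraphDriver(ABC):
    """Hypothetical backend-neutral interface (not Plume's real API)."""

    @abstractmethod
    def add_vertex(self, vertex_id: str, label: str) -> None: ...

    @abstractmethod
    def add_edge(self, src: str, dst: str, label: str) -> None: ...

    @abstractmethod
    def successors(self, vertex_id: str, label: str) -> List[str]: ...


class InMemoryDriver(GraphDriver):
    """Stand-in backend; a real driver would issue database queries instead."""

    def __init__(self) -> None:
        self.labels = {}
        self.out_edges = {}

    def add_vertex(self, vertex_id: str, label: str) -> None:
        self.labels[vertex_id] = label

    def add_edge(self, src: str, dst: str, label: str) -> None:
        self.out_edges.setdefault((src, label), []).append(dst)

    def successors(self, vertex_id: str, label: str) -> List[str]:
        return list(self.out_edges.get((vertex_id, label), []))


def reachable(driver: GraphDriver, start: str, label: str) -> Set[str]:
    """Written once against the interface, so it runs on any backend."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in driver.successors(node, label):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


driver = InMemoryDriver()
for name in ("main", "parse", "run"):
    driver.add_vertex(name, "METHOD")
driver.add_edge("main", "parse", "CALLS")
driver.add_edge("parse", "run", "CALLS")

calls_from_main = reachable(driver, "main", "CALLS")
print(sorted(calls_from_main))
```

Because `reachable` only depends on the interface, swapping in a different driver leaves the operation itself unchanged, which is what makes a like-for-like comparison across databases possible.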
How would you describe your experience participating in the TigerGraph Graphathon?
It was really exciting to work on something and be on the same playing field as others. South Africa is pretty far from a lot of the big tech stuff that happens. We have some exciting tech stuff happening, but not many things that I’m interested in, like static analysis, which is why I’m here in the Bay Area. The exciting technology in my focus topics is mainly researched and developed overseas. So participating in an online hackathon in that domain was very exciting. Winning made me feel proud of the standard that the university is teaching, and I was amazed at what I’ve learned from remote resources and talking to the TigerGraph team. The Graphathon helped me see my potential and how it is not limited to my geographical location.
Speaking of amazing projects, you also created a TigerGraph Yelp Starter Kit. Can you tell me more about that project and what inspired it?
The TigerGraph Yelp Starter Kit was something I worked on after my honors year in university. My honors project used the Yelp dataset as a candidate spatio-temporal dataset in the research project I mentioned earlier. In South Africa, you study for three years and then graduate. In the fourth year, you complete honors, and then you go on to do your master’s. During the first year of my master’s, I wanted to do something to help other students who don’t have a year to learn the full extent of graph technology. Students have a couple of weeks to set up a database, ingest the data, and then do something cool with it. Creating this starter kit was a way to help students, and anyone else who stumbled across it, quickly get started with graph. With the starter kit, people can access large datasets and import, process, and play around with the data. One of my biggest challenges was not finding interesting datasets natively structured for graph databases, so I wanted to help others have a better experience in this aspect.
It’s amazing that you used your time during your master’s to build something that would help others! What would you say has been the most influential resource on your journey?
Key developers have been one of the most influential resources for me. Often, that’s because they’re the main knowledge base for specific tools, technologies, or theories. I’m very grateful for their time and willingness to help me. Outside of that, I would say that well-written books, online resources, and documentation have been a great help. Often, the one thing that gets lost along the way is explaining how to use the products you produce. That is why I create a lot of starter kits and documentation around the projects I work on, to foster the habit of providing well-written documentation for others.
Speaking of well-written documentation, you’ve written a couple of blogs, including Integration Testing with TigerGraph and Efficient Use of TigerGraph and Docker, but if you were to write a book, what would it be about?
I would probably write a technical textbook. Thinking back on my experience having to learn Scala, what made the biggest impact was finding a helpful textbook that covered the relevant version and the surrounding tools. It gave me a clear way to view and understand how the language works in various contexts and use cases, with sufficient depth. Surprisingly, books were much more efficient than listening to lectures, trying to Google everything, or using StackOverflow, which doesn’t necessarily foster deeper understanding. My textbook would be about my research in static analysis: the tools I use, how to develop your own, and the common misunderstandings when getting into this field.
We look forward to seeing more documentation coming from you! Is there anything else you want to share about yourself?
The main open source project that I help maintain and work on is called Joern. We essentially use a type-safe, purpose-built graph database to analyze programs at scale, and it’s built to be programming language-agnostic. I’m currently working with two Stellenbosch honors students to add support for Solidity, so that we can analyze blockchain languages and apps, and to add a REST server for analyzing programs remotely and gathering user data.
Connect with Other Developers
There are hundreds of TigerGraph developers talking daily in TigerGraph’s developer chat. If you would like to get help from others, discuss ideas, or just meet other graph enthusiasts, check out TigerGraph’s Developer Chat!