Chris Bensen with a Massive Raspberry Pi Cluster
Jim Grisanzio with Chris Bensen

Duke’s Corner Podcast — November 7, 2022

Duke’s Corner Podcast with Oracle engineer Chris Bensen about the massive Raspberry Pi cluster he was showing at JavaOne & Oracle CloudWorld in Las Vegas. The cluster was connected to Oracle Cloud and ran a variety of technologies, such as Java, Linux, Oracle Database, MySQL, and more.

https://dukescorner.libsyn.com/bruno-souza-live-at-javaone-las-vegas-2022

Transcript

(00:00:00):
Hey, it’s Jim.

(00:00:01):
Welcome back to Duke’s Corner.

(00:00:02):
So up next here is the second interview from JavaOne.

(00:00:05):
This is with Chris Bensen.

(00:00:06):
I’ve known Chris for years.

(00:00:08):
He’s an engineer.

(00:00:09):
He’s a hardware hacker.

(00:00:10):
He’s an open source enthusiast at heart.

(00:00:13):
And he’s going to be talking about version 2 of the

(00:00:17):
Raspberry Pi cluster that he and a bunch of others have been building for a couple of years.

(00:00:22):
Actually, a few years now.

(00:00:23):
The first version of this was rolled out in 2019 in San Francisco at Oracle Cloud World.

(00:00:28):
And this is the same machine, but it’s rebuilt and updated.

(00:00:32):
And he’s got more technologies on there.

(00:00:33):
Obviously, Java’s on there.

(00:00:35):
Linux. It’s connected to Oracle Cloud.

(00:00:37):
It’s basically a showcase of multiple Oracle technologies.

(00:00:41):
And what’s cool about it is,

(00:00:42):
aside from a technology point of view,

(00:00:44):
which he’ll explain in the conversation…

(00:00:46):
What I like best is how he works with other organizations across the company,

(00:00:52):
other engineering teams who are interested in building this thing.

(00:00:57):
And so that’s really great.

(00:00:59):
It’s a great story.

(00:01:00):
And yeah, so I’ve got about four or five more of these conversations from JavaOne with various people.

(00:01:07):
And that’s it.

(00:01:08):
Talk to you soon.

(00:01:10):
Hey, everybody.

(00:01:11):
How you doing?

(00:01:12):
This is Jim Grisanzio from the Java Developer Relations team,

(00:01:15):
and I’m here with my buddy Chris Bensen here at JavaOne in Las Vegas.

(00:01:19):
And Chris is an engineer who built this amazing machine behind us here,

(00:01:23):
and he’s going to tell us about it.

(00:01:24):
Chris, welcome.

(00:01:25):
Hi, Jim.

(00:01:26):
Thanks for having me.

(00:01:27):
This is a lot of fun.

(00:01:27):
Yeah, so we built the world’s largest Raspberry Pi cluster that we know of.

(00:01:31):
We haven’t got the Guinness Book.

(00:01:34):
I guess the “that we know of” is sort of important, right?

(00:01:36):
That is important because,

(00:01:38):
you know,

(00:01:38):
without the Guinness Book,

(00:01:39):
you can’t say it officially,

(00:01:40):
unfortunately.

(00:01:42):
So how many are here?

(00:01:43):
We have 1050 Raspberry Pi.

(00:01:45):
Right now running is 1012.

(00:01:49):
I don’t know what’s happened to the other 38.

(00:01:50):
They’ve kind of checked out or maybe gamma rays because we’re network booting every

(00:01:54):
single Raspberry Pi off Oracle Linux and they’re all running Oracle Linux

(00:01:58):
and running GraalVM with Java and there’s some Python too.

(00:02:02):
So there’s lots of glue code in here and we’ve got pretty much every Oracle,

(00:02:08):
well not every Oracle technology,

(00:02:09):
but as many as I could cram into this blue box that kind of looks reminiscent of a

(00:02:15):
certain spaceship from a TV show.

(00:02:18):
Now, when we first actually displayed this machine, it was in 2019 in San Francisco.

(00:02:24):
That was the first version of it.

(00:02:26):
So talk a little bit about some of the updates that you did,

(00:02:28):
because you told me actually yesterday that you completely rebuilt everything.

(00:02:32):
You took it apart completely, 100%.

(00:02:34):
I’d say about 90% was taken apart and replaced.

(00:02:40):
So the first time around, and I guess, yeah, that’s a good point.

(00:02:44):
Let’s call that 1.0.

(00:02:46):
That was a Java community project that was running 100% Java through and through.

(00:02:51):
The Gluon team was kind of spearheading that with Johan.

(00:02:57):
And that was pretty awesome.

(00:02:58):
They had a great thing.

(00:02:58):
We had a big video wall that was really showing off JavaFX.

(00:03:03):
This time around, we’re trying to show off connectivity to the cloud, as well as GraalVM.

(00:03:10):
So what we have here is, the first time around, they didn’t take into account cooling.

(00:03:15):
Because I thought, well, it was a hard problem.

(00:03:18):
And I thought,

(00:03:19):
well,

(00:03:19):
you know,

(00:03:20):
some of these engineering challenges,

(00:03:21):
you just kind of kick down the road and you,

(00:03:23):
well,

(00:03:23):
we’ll deal with that later.

(00:03:24):
So we had seven fans, big fans, and each one of them could evacuate all the air in about 30 seconds.

(00:03:31):
So we figured seven of them will be able to evacuate all the air.

(00:03:34):
It turned out, though, that they evacuated the air and they pulled it in here, fluid dynamics, go figure.

(00:03:40):
So…

(00:03:42):
Fast forward,

(00:03:43):
we were going to ship this around the world just after OpenWorld,

(00:03:46):
but it takes a lot of power,

(00:03:48):
like a whole house of power.

(00:03:51):
I know, it’s comical.

(00:03:53):
And so it was kind of hard to get in the door, too, because it’s big.

(00:03:56):
So events teams, those combinations made it really hard.

(00:03:59):
So I made two 84-Pi Pi clusters that were a little easier to ship.

(00:04:03):
And when we built that, I solved the cooling problem.

(00:04:07):
So

(00:04:08):
I added five fans behind each one of these 21-Pi 2U racks.

(00:04:13):
So that’s a lot more cooling then.

(00:04:16):
Yeah, you can feel it.

(00:04:17):
It feels like a space heater in front of you.

(00:04:21):
I’ve never seen a space heater quite this big and on both sides.

(00:04:25):
It’s silly.

(00:04:26):
So there’s 250 fans,

(00:04:28):
well,

(00:04:28):
257 fans now in this,

(00:04:31):
minus the ones in the switches and the servers and stuff.
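
The fan count can be checked against the other figures in the conversation. This is a minimal sketch, assuming the whole cluster is built from the same 21-Pi racks with five fans each; the guess that the extra seven are the large fans from version 1 is an assumption and is not stated in the interview.

```python
# Figures quoted in the conversation.
TOTAL_PI = 1050
PI_PER_RACK = 21      # each 2U rack holds 21 Pi
FANS_PER_RACK = 5     # five fans behind each rack

# Assumption, not stated in the interview: the seven extra fans are the
# seven large fans described for version 1.
EXTRA_FANS = 7


def rack_count(total_pi: int, pi_per_rack: int) -> int:
    """Number of racks needed to hold total_pi boards, rounded up."""
    return -(-total_pi // pi_per_rack)


def fan_count(total_pi: int, pi_per_rack: int, fans_per_rack: int, extra: int = 0) -> int:
    """Rack fans plus any extra fans elsewhere in the enclosure."""
    return rack_count(total_pi, pi_per_rack) * fans_per_rack + extra


if __name__ == "__main__":
    print(rack_count(TOTAL_PI, PI_PER_RACK))
    print(fan_count(TOTAL_PI, PI_PER_RACK, FANS_PER_RACK))
    print(fan_count(TOTAL_PI, PI_PER_RACK, FANS_PER_RACK, EXTRA_FANS))
```

Under these assumptions the numbers come out to 50 racks, 250 rack fans, and 257 fans in total, which lines up with the "250, well, 257" in the conversation.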

(00:04:35):
And that cools it fine.

(00:04:36):
It’s been running now for three days.

(00:04:38):
Wow.

(00:04:39):
We haven’t shut it off.

(00:04:40):
I got 1012 Pi running and I haven’t wanted to shut it off because you don’t know what will happen.

(00:04:45):
Like I said,

(00:04:45):
when you network boot them,

(00:04:47):
sometimes there’s hiccups and stuff and there’s little bugs in the Pi firmware so

(00:04:51):
you don’t know if you’re actually going to be able to boot it.

(00:04:53):
So when one light doesn’t go on,

(00:04:56):
you don’t know if you’re going to be able to kick that one.

(00:04:59):
You know,

(00:04:59):
if you turn off all of them and turn it back on, because 42 Pi are

(00:05:02):
connected to one switch,

(00:05:04):
it’s all or nothing for those 42 Pi.

(00:05:06):
So it takes about three hours to boot the whole thing because you’ve got to make

(00:05:09):
sure you do it nice and slow and make sure every pi has come online and everything.
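
The slow, switch-by-switch boot described here could be sketched roughly as follows. This is not the team's actual tooling: `power_on_group` and `is_online` are hypothetical callbacks standing in for whatever powers a switch's worth of Pi and checks that each one has come up, and the timeout values are placeholders.

```python
import time

PI_PER_SWITCH = 42  # figure quoted in the conversation


def group_by_switch(pi_ids, per_switch=PI_PER_SWITCH):
    """Split a list of Pi identifiers into per-switch groups."""
    return [pi_ids[i:i + per_switch] for i in range(0, len(pi_ids), per_switch)]


def boot_in_stages(groups, power_on_group, is_online,
                   timeout_s=600.0, poll_s=5.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Power on one group at a time and wait for its Pi to respond.

    power_on_group and is_online are hypothetical callbacks supplied by
    the caller. Returns the Pi that never came online before the timeout.
    """
    missing = []
    for group in groups:
        power_on_group(group)
        pending = set(group)
        deadline = clock() + timeout_s
        while pending and clock() < deadline:
            # Keep only the Pi that still have not answered.
            pending = {pi for pi in pending if not is_online(pi)}
            if pending:
                sleep(poll_s)
        missing.extend(sorted(pending))
    return missing
```

With 42 Pi per switch, 1050 Pi split into 25 groups, so a three-hour boot works out to a little over seven minutes per group.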

(00:05:14):
So, yeah, what is the other upgrade?

(00:05:16):
We’ve got a plaque on the side for everyone that’s contributed.

(00:05:19):
That’s really important.

(00:05:21):
And it’s a pretty long list of people.

(00:05:23):
Actually,

(00:05:23):
I wanted to talk to you about,

(00:05:25):
to talk a little bit about the different teams at Oracle that you’ve worked with or

(00:05:28):
individuals that you’ve worked with to actually construct it and update it and

(00:05:33):
everything that’s been going on.

(00:05:34):
Yeah,

(00:05:34):
it’s really become one of those 10,

(00:05:37):
20% passion projects that people from across the org,

(00:05:42):
they get to…

(00:05:42):
You know,

(00:05:43):
people ask why.

(00:05:45):
Obviously, it’s just cool.

(00:05:47):
It kicks ass.

(00:05:48):
I mean, do I need to say any more?

(00:05:49):
It’s a show piece.

(00:05:52):
But…

(00:05:53):
You know, we need justification, and we work together with other engineering teams.

(00:05:57):
So there’s the Oracle Linux team,

(00:06:00):
there’s the Java team,

(00:06:02):
there’s the Oracle Labs team,

(00:06:03):
there’s OCI,

(00:06:04):
there’s Database.

(00:06:05):
So we’ve got… Yeah, that’s all the core projects of the company.

(00:06:09):
That’s pretty much everything except for apps.

(00:06:13):
There’s GBUs too, so like the SailGP, we couldn’t fit that on.

(00:06:18):
Well, you say that a lot of this is about passion.

(00:06:21):
I remember after the San Francisco event in 2019,

(00:06:25):
right after the event was over,

(00:06:28):
I heard in the hallways all throughout the company,

(00:06:31):
even people who were totally uninvolved with it were talking about it.

(00:06:35):
So that passion, it gets around.

(00:06:39):
It has.

(00:06:40):
So after 2019, you know, a certain pandemic happened and things kind of locked down a little bit.

(00:06:47):
And you helped me with a YouTube video.

(00:06:49):
I had scratch video from building this thing.

(00:06:54):
We had bought some cameras and we’re like, well, maybe we can do a video of this and

(00:06:58):
I remember you said, don’t you have video?

(00:06:59):
And I’m like, oh yeah, oh yeah.

(00:07:02):
So I put together a six minute video.

(00:07:04):
It was originally a lot longer, but you kept telling me, cut it down, cut it down, cut it down.

(00:07:08):
So six minute video and it’s gotten nearly a million views now.

(00:07:11):
And I talk to people, people come here and they’re like,

(00:07:15):
I’ve seen you on YouTube.

(00:07:16):
I get that now with even neighbors’ kids.

(00:07:19):
It’s kind of interesting.

(00:07:23):
And since then, I’ve done other YouTube videos.

(00:07:24):
Like this has been in my garage for the last six months,

(00:07:28):
adding the fans and doing the software upgrades and everything.

(00:07:32):
And so it’s been hashtag big Pi cluster in my garage.

(00:07:36):
And I got a little bored, to be honest.

(00:07:38):
And so there’s two of me sometimes, you know, video editing, talking to myself.

(00:07:43):
Some people have been kind of worried.

(00:07:44):
I’m OK.

(00:07:45):
No, I’m not.

(00:07:46):
I am.

(00:07:48):
You mentioned like your neighbor’s kid.

(00:07:50):
I mean,

(00:07:50):
that’s actually interesting because the Raspberry Pi community is

(00:07:54):
global and it’s very passionate and it’s massive as well.

(00:07:58):
So that’s also a part of the allure.

(00:08:00):
Yeah,

(00:08:00):
I’ve been having discussions with some of the Pi YouTubers and people in the

(00:08:06):
community like Eben Upton and Jeff Geerling and various people.

(00:08:10):
So it’s a tight community.

(00:08:12):
Everyone just wants good for Raspberry Pi and just makers in general.

(00:08:18):
I mean, all we’re…

(00:08:22):
It’s just fun.

(00:08:23):
You know, I mean, it’s just really fun.

(00:08:25):
Every single one of these pieces is 3D printed.

(00:08:27):
So it’s got all kinds of other technologies as well as Oracle technologies.

(00:08:33):
You know, obviously the Javas and the Graals and various things.

(00:08:38):
So you mentioned your 3D printing.

(00:08:40):
Talk a little bit about that.

(00:08:41):
Yeah, so all the caddies that hold each Pi are 3D printed.

(00:08:46):
I designed that in CAD.

(00:08:48):
It’s actually available on Thingiverse for free.

(00:08:51):
Anybody can download it.

(00:08:53):
You know, I get contacted occasionally that someone’s built an 84-Pi Pi cluster or just 21 Pi.

(00:09:00):
I mean, that’s still a lot of Pi, especially nowadays, when it’s hard to get hold of one.

(00:09:03):
Usually most of those happened before the last year or so or they had the Pi sitting around.

(00:09:11):
But,

(00:09:11):
yeah,

(00:09:11):
people can build it themselves and they can modify the files too if they want to

(00:09:15):
modify it and do their own.

(00:09:17):
So it’s like an open source project as well.

(00:09:20):
In a lot of ways, yeah.

(00:09:20):
I mean,

(00:09:21):
all the code for this right now running on it,

(00:09:23):
what’s running on it right now is 100% open source.

(00:09:25):
You wrote a lot of that.

(00:09:27):
Wrote most of it, yeah.

(00:09:29):
We have other bits of it,

(00:09:31):
too,

(00:09:31):
because we have a digital twin piece of it,

(00:09:33):
so all the data that’s coming off of every single Pi.

(00:09:36):
The most important one is the temperature, the CPU usage, and the memory.

(00:09:42):
So that goes, as well as the processes and their PIDs, so we can kill a process and everything.

(00:09:49):
That all goes up into Autonomous Database via the REST API through ORDS.

(00:09:54):
So we got the JSON payload that’s coming through,

(00:09:57):
and so we can do the really nice queries right in the database.
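
A telemetry report of the kind described here might look something like the sketch below. The field names and the endpoint URL are made up for illustration; the interview only says that temperature, CPU usage, memory, and process information travel as a JSON payload to Autonomous Database through an ORDS REST endpoint.

```python
import json
import urllib.request

# Placeholder URL; the real ORDS endpoint is not given in the interview.
ORDS_URL = "https://example.com/ords/demo/telemetry/"


def build_payload(hostname, temp_c, cpu_percent, mem_used_mb, processes):
    """Collect one Pi's readings into a JSON-serializable dict.

    The field names are hypothetical. processes is a list of
    (pid, name) pairs.
    """
    return {
        "hostname": hostname,
        "temp_c": temp_c,
        "cpu_percent": cpu_percent,
        "mem_used_mb": mem_used_mb,
        "processes": [{"pid": pid, "name": name} for pid, name in processes],
    }


def make_request(url, payload):
    """Build, but do not send, a JSON POST request for the payload."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    payload = build_payload("pi-0001", 48.5, 12.0, 310, [(412, "java")])
    request = make_request(ORDS_URL, payload)
    # urllib.request.urlopen(request) would send it to a real endpoint.
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Keeping the payload as plain JSON is what makes it possible to query the readings directly in the database once they arrive.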

(00:10:03):
And so that is all the data telemetry,

(00:10:06):
and then we have an Oculus headset for VR,

(00:10:10):
so you can have a digital twin of the Pi cluster in your Oculus VR room.

(00:10:16):
When a light blinks here, it takes about 200 milliseconds for it to show up in the digital universe.

(00:10:22):
We have augmented reality,

(00:10:23):
so you can hold up an iPad over the Pi cluster,

(00:10:26):
and you can tap on a Pi so you can find out this one’s IP address.

(00:10:30):
We can’t do that.

(00:10:31):
Right now, it’s really hard.

(00:10:35):
It’s really hard if a Pi fails or we need to dig into a specific Pi.

(00:10:40):
If it’s not turned on, we can’t deal with it.

(00:10:43):
But I can find out the IP address of a Pi just with augmented reality.

(00:10:46):
That’s surreal.

(00:10:49):
What I’m hoping we can do… After this conference, it’s going to go into the Oracle Labs.

(00:10:55):
It’s going to sit in there for a bunch of research projects.

(00:10:58):
Some research students have been working on this already.

(00:11:01):
GraalVM has their native…

(00:11:05):
image for Python and for Java.

(00:11:09):
So we can compile that down to native image on an ARM device.

(00:11:13):
And so that’s going to be really good for Ampere and any ARM, M1, all that stuff.

(00:11:19):
And the research students that work on the Graal team have been working to do that.

(00:11:25):
They’re getting their doctorates in various things.

(00:11:27):
So it’s really cool to work with them.

(00:11:29):
Yeah, there’s a lot of PhDs over there.

(00:11:31):
Yeah, they’re pretty smart people.

(00:11:34):
And so this is going to go into the lab, and they’re going to be able to continue and further their work.

(00:11:40):
So they’ll be running testing on it, because it’s a pretty big cluster.

(00:11:44):
They’ll be able to run tons of tests on it.

(00:11:47):
The Java team is going to be able to do some research on it.

(00:11:50):
So think of Panama, be able to access the GPU.

(00:11:56):
That’ll be pretty cool.

(00:11:59):
There’s, what else, Oracle Linux.

(00:12:01):
I want to try and get a nice small Docker container version of Oracle Linux.

(00:12:07):
We’ll see what happens with that, crossing fingers.

(00:12:10):
So the point is there’ll be a neat research project for people to be able to work

(00:12:14):
on it,

(00:12:14):
as well as I’m hoping we can open it up a little bit to the community beyond just

(00:12:18):
what I have running on it.

(00:12:20):
What’s running on it right now,

(00:12:21):
the workload,

(00:12:22):
is actually a custom programming language that I wrote just for Twitter and the Pi cluster.

(00:12:27):
It only took a couple of days.

(00:12:28):
It was written in Python and I started off with an easy open source project.

(00:12:31):
But it’s called Warble.

(00:12:33):
You can tweet.

(00:12:35):
The idea was to be able to condense this down.

(00:12:37):
So, it’s like C Python meets Pascal.

(00:12:43):
And it’s a reduced amount of characters.

(00:12:47):
So it’s as small as you can write it.

(00:12:49):
And I wanted people to be able to post a tweet and calculate pi on the Pi cluster,

(00:12:54):
running through the whole Pi cluster.
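
To give a feel for spreading a pi calculation across many nodes, here is a small sketch in Python rather than Warble, whose syntax is not shown in the interview. It uses the Leibniz series as a stand-in, since the actual algorithm is not named: each worker sums every `workers`-th term, and the partial sums are combined at the end. The work is done in a plain loop here instead of on separate machines.

```python
import math


def partial_leibniz(worker: int, workers: int, terms: int) -> float:
    """Sum Leibniz terms with index worker, worker + workers, ... below terms."""
    total = 0.0
    for n in range(worker, terms, workers):
        sign = 1.0 if n % 2 == 0 else -1.0
        total += sign / (2 * n + 1)
    return total


def estimate_pi(workers: int, terms: int) -> float:
    """Combine the partial sums from all workers into an estimate of pi.

    On a real cluster each partial sum would run on a different node;
    this sketch simply computes them one after another.
    """
    return 4.0 * sum(partial_leibniz(w, workers, terms) for w in range(workers))


if __name__ == "__main__":
    estimate = estimate_pi(workers=1012, terms=200_000)
    print(estimate, abs(estimate - math.pi))
```

The Leibniz series converges slowly, so it is a poor choice for serious digit counts, but it splits cleanly into independent pieces, which is the property that matters for a cluster demo.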

(00:12:56):
And it’s actually displayed on an APEX database, or a page generated with APEX stored in the database.

(00:13:05):
Just touching everything that Oracle does here.

(00:13:08):
I’ve been noticing you’ve been having conversations with people because we’re right

(00:13:12):
here on the show floor.

(00:13:13):
You are right here at the main entrance here.

(00:13:15):
Talk about some of the things that people are sort of asking you.

(00:13:18):
Give me like a summary of what you’re talking about with some of the people,

(00:13:22):
some of the developers that come by.

(00:13:24):
It’s a range.

(00:13:26):
From 3D printers, I was just having a conversation just before you came here.

(00:13:31):
There’s plenty of people that

(00:13:33):
Wow, once they find out, they’re like, wow, this is 3D printed.

(00:13:36):
That’s what I can do with a 3D printer.

(00:13:37):
I didn’t know that.

(00:13:39):
I’ve been refraining.

(00:13:42):
I mean,

(00:13:43):
don’t just print plastic to throw it away,

(00:13:45):
but if you want to print a Pi case or anything,

(00:13:48):
why buy one when you can design your own?

(00:13:50):
Learn CAD and print it out or get a file.

(00:13:52):
There’s free files out there.

(00:13:54):
They’re not all good,

(00:13:55):
so it’s kind of hard,

(00:13:56):
but 3D printers are amazing,

(00:13:59):
and there’s a lot of conversations about that because anything blue on here,

(00:14:04):
is 3D printed.

(00:14:07):
That’s a lot.

(00:14:07):
Everything’s blue.

(00:14:08):
Yeah.

(00:14:11):
So, like, there’s pieces to turn on the switch.

(00:14:14):
You put magnets in it.

(00:14:16):
I don’t know if the camera can see that, but that clips in.

(00:14:19):
So, yeah, it’s a lot of engineering challenges with physical engineering.

(00:14:24):
People are very interested in Pi and especially network booting the Pi because

(00:14:29):
not very many people network boot a Pi.

(00:14:30):
They put SD cards in it.

(00:14:32):
None of this has an SD card.

(00:14:34):
That’s really the key. And a lot of them are asking specific questions. Like, there

(00:14:39):
was one on Database: you know, when are we going to get ARM, you know, the Oracle

(00:14:43):
Database on ARM? So, you know, I’m having conversations with, you know, some of the

(00:14:48):
product managers and stuff. So this is a community. It’s not just, you know... I

(00:14:54):
personally, I take this as I’m here with, you know, this project and representing

(00:15:00):
Oracle and

(00:15:04):
A lot of people that sit on a floor like this,

(00:15:06):
they go,

(00:15:06):
oh,

(00:15:07):
you know,

(00:15:07):
just contact our,

(00:15:08):
you know,

(00:15:08):
something department.

(00:15:10):
I take their emails back and their questions and I go contact those engineering people,

(00:15:13):
you know,

(00:15:13):
teams and stuff and get back to them.

(00:15:15):
If we have a specific bug, let’s fix it because they’re not the only ones having the problem.

(00:15:19):
So that’s one of the big conversations we’ve had as well with people.

(00:15:25):
But they’re curious about the whole stack.

(00:15:27):
The whole entire stack is interesting from the engineering,

(00:15:29):
from the very,

(00:15:30):
you know,

(00:15:30):
just the physical thing, to a Pi, all the way to what we’re running in the cloud

(00:15:35):
and how they can use some of these technologies.

(00:15:38):
All right, Chris.

(00:15:39):
Well, that’s a great summary.

(00:15:40):
Thank you very much.

(00:15:42):
Good luck with it today.

(00:15:43):
And we’re all looking forward to version 3 next year.

(00:15:46):
And so the world’s largest Raspberry Pi cluster is…

(00:15:54):
that we know of.

(00:15:56):
Thanks, Chris.

(00:15:56):
We’ll talk to you soon.