1 00:00:00,320 --> 00:00:09,500 *32C3 preroll music* 2 00:00:09,500 --> 00:00:16,240 Herald: Okay, welcome to our last talk in this hall today! 3 00:00:16,240 --> 00:00:20,420 It’s about Console Hacking and I guess that’s the reason why you are here. 4 00:00:20,420 --> 00:00:23,509 Console hacking has a long tradition at our great conference 5 00:00:23,509 --> 00:00:30,109 and we have seen lots of funny things. People doing stuff with Xboxes, 6 00:00:30,109 --> 00:00:33,900 Playstations and everything. 7 00:00:33,900 --> 00:00:39,010 Okay. Today we got a team which deals with the Nintendo DS, 8 00:00:39,010 --> 00:00:44,260 so give a warm applause for plutoo, derrek and smea! 9 00:00:44,260 --> 00:00:53,770 *applause* 10 00:00:53,770 --> 00:00:58,910 smea: Hi! I’m smea, this is plutoo, this is derrek, 11 00:00:58,910 --> 00:01:02,930 and today we are going to talk to you about our work on the Nintendo 3DS. 12 00:01:02,930 --> 00:01:05,390 So, the way this talk is going to be structured, is we are just going to 13 00:01:05,390 --> 00:01:08,850 go over all the hardware, organisation, software, like… 14 00:01:08,850 --> 00:01:12,240 Just give you a basic overview about how the system works. 15 00:01:12,240 --> 00:01:15,040 And after that we are going to go into 16 00:01:15,040 --> 00:01:18,330 basically every layer of security the system has, 17 00:01:18,330 --> 00:01:21,269 and break every one of them. 18 00:01:21,269 --> 00:01:23,219 *laughter* 19 00:01:23,219 --> 00:01:27,550 *applause* 20 00:01:27,550 --> 00:01:31,860 Okay. So, as you probably know, the 3DS, the original Nintendo 3DS 21 00:01:31,860 --> 00:01:36,500 was released in 2011. It’s a system that is kind of underpowered. 22 00:01:36,500 --> 00:01:41,479 It’s got, like… It’s got an ARM11 dual core CPU, 23 00:01:41,479 --> 00:01:46,399 268MHz, it’s got a nice proprietary GPU, a bit of RAM, 24 00:01:46,399 --> 00:01:49,920 you know, the usual. 
It’s also backwards compatible with the DS games, 25 00:01:49,920 --> 00:01:55,299 which is nice. Then the new 3DS was released in 2014 and 2015, 26 00:01:55,299 --> 00:02:01,060 there was like different regions. And it was basically just the same console, 27 00:02:01,060 --> 00:02:04,239 just some improvements in the hardware. You’ve got a better CPU, 28 00:02:04,239 --> 00:02:09,410 it has got more cores. It’s faster, it has got more RAM. Basically everywhere. 29 00:02:09,410 --> 00:02:12,240 So, it is just the same thing, it runs the same software, exactly. 30 00:02:12,240 --> 00:02:15,800 It has got some exclusive software, but not much. 31 00:02:15,800 --> 00:02:19,460 So, in terms of a hardware overview, this is what what we are going to talk about 32 00:02:19,460 --> 00:02:24,050 looks like; in general. So you got the top part right here, 33 00:02:24,050 --> 00:02:27,490 which is what we are going to go into first. 34 00:02:27,490 --> 00:02:31,470 This is like the ARM11 part. 35 00:02:31,470 --> 00:02:35,110 Basically, you’ve got the ARM11, which is the main CPU. It runs 36 00:02:35,110 --> 00:02:40,740 the main operating system. It has 2 cores as I just said, or 4 cores. 37 00:02:40,740 --> 00:02:42,790 So, it runs the main operating system, it runs the games, 38 00:02:42,790 --> 00:02:45,340 it runs all the applications. Basically, it’s just – 39 00:02:45,340 --> 00:02:48,380 if you’re doing something on the 3DS that you can… you can see it happening, 40 00:02:48,380 --> 00:02:52,220 it’s happening on that CPU. It has got access to all of the main memory. 41 00:02:52,220 --> 00:02:56,090 So that includes FCRAM, 42 00:02:56,090 --> 00:03:01,040 which is 128MB or 256MB, 43 00:03:01,040 --> 00:03:04,730 depending on which model it is. And FCRAM is actually divided 44 00:03:04,730 --> 00:03:09,130 into 3 separate regions. So you first got the Application Region, 45 00:03:09,130 --> 00:03:12,520 which contains the currently running game or application. 
46 00:03:12,520 --> 00:03:17,200 The System Region, which contains applets, which are basically tiny applications, 47 00:03:17,200 --> 00:03:20,050 which run in the background. So, that includes the home menu, 48 00:03:20,050 --> 00:03:23,390 which is actually always running in background, and the web browser, 49 00:03:23,390 --> 00:03:25,890 which you can actually run at the same time as your game, so 50 00:03:25,890 --> 00:03:28,860 it has to run there. And then you got the Base Region, which is more interesting. 51 00:03:28,860 --> 00:03:31,050 It contains all the system modules of the operating system, 52 00:03:31,050 --> 00:03:35,260 as well as some kernel data, such as handle tables 53 00:03:35,260 --> 00:03:39,840 and MMU tables. So it is kind of sensitive stuff. And then we got AXI WRAM, 54 00:03:39,840 --> 00:03:44,330 which is tiny and contains all the kernel code, and, well, 55 00:03:44,330 --> 00:03:49,550 most of the kernel structures as well. So it’s also an interesting target. 56 00:03:49,550 --> 00:03:55,160 Then we’ve got the lower part, which is the ARM9 part of the hardware. 57 00:03:55,160 --> 00:03:58,270 So the ARM9 is basically a separate, well… 58 00:03:58,270 --> 00:04:02,790 it’s an entirely separate CPU, which has access to… 59 00:04:02,790 --> 00:04:06,760 well… So it runs basically the same microkernel as the ARM11. 60 00:04:06,760 --> 00:04:11,600 It’s mostly the same code, it has just got fewer features. 61 00:04:11,600 --> 00:04:14,630 Mostly it runs a single process, which is called ‘Process9’, 62 00:04:14,630 --> 00:04:19,399 which does everything the ARM9 does. Beyond that the role of the ARM9 is 63 00:04:19,399 --> 00:04:24,260 to broker access to hardware that might be sensitive in terms of security. 64 00:04:24,260 --> 00:04:29,320 So one of the things it does is it brokers access to all storage media, 65 00:04:29,320 --> 00:04:33,590 so that includes the permanent storage as well as the SD card. 
66 00:04:33,590 --> 00:04:38,450 And then it does all sorts of crypto stuff, which is really important, 67 00:04:38,450 --> 00:04:43,930 and does that by using hardware, actually. So there is this hardware key scrambler, 68 00:04:43,930 --> 00:04:48,260 which is used to.. to store secrets in hardware basically. 69 00:04:48,260 --> 00:04:51,100 The idea is, you feed it two separate keys, 70 00:04:51,100 --> 00:04:54,980 and it is going to generate a normal key and feed that directly 71 00:04:54,980 --> 00:04:59,260 into the hardware implementation of the AES algorithm. 72 00:04:59,260 --> 00:05:02,340 So that way, we never actually see the final keys. 73 00:05:02,340 --> 00:05:06,430 So that’s something that is kind of annoying. 74 00:05:06,430 --> 00:05:10,100 And then beyond that what you can see is: the ARM9 has access to all of main memory 75 00:05:10,100 --> 00:05:13,890 without much of, well, without any restrictions. But it has also got 76 00:05:13,890 --> 00:05:17,790 its own internal memory which the ARM11 does not have access to. 77 00:05:17,790 --> 00:05:21,350 So the ARM9 internal memory is where the ARM9 stores all its code, 78 00:05:21,350 --> 00:05:26,600 all of its data; and this way we can’t actually take over the ARM9 79 00:05:26,600 --> 00:05:33,340 just from the ARM11 without some kind of exploit. So it’s basically a security CPU. 80 00:05:33,340 --> 00:05:36,730 So this leads us to having 4 layers of security. 81 00:05:36,730 --> 00:05:39,940 Basically, you’re first going to have the ARM11 userland, which is what… 82 00:05:39,940 --> 00:05:43,550 well, like your games, your applications, whatever. On top of that, 83 00:05:43,550 --> 00:05:48,630 you’re going to have, well, below that, I guess, the ARM11 kernel. 84 00:05:48,630 --> 00:05:51,810 So that is going to have full privileges on the ARM11. 85 00:05:51,810 --> 00:05:55,300 And then you’re going to have ARM9 userland, which is ‘Process9’. 
86 00:05:55,300 --> 00:05:59,560 Beyond that you’ll have ARM9 kernel mode. So that’s in theory. 87 00:05:59,560 --> 00:06:04,380 In practice, the microkernel has a system call, 88 00:06:04,380 --> 00:06:09,280 which we call… syscall… we call it ‘svc backdoor’. 89 00:06:09,280 --> 00:06:13,510 Because essentially you feed it a function pointer and it just executes 90 00:06:13,510 --> 00:06:16,970 that function in kernel mode. So you don’t even need an exploit 91 00:06:16,970 --> 00:06:20,889 if you have access to that syscall. Of course, on the ARM11 92 00:06:20,889 --> 00:06:25,300 no application or title or anything ever has access to that, 93 00:06:25,300 --> 00:06:29,560 but on the ARM9 ‘Process9’ actually has access to it. Which means, 94 00:06:29,560 --> 00:06:34,050 that from here we actually… well, userland and kernel mode 95 00:06:34,050 --> 00:06:37,770 are basically the same thing. When you got userland on the ARM9, 96 00:06:37,770 --> 00:06:41,020 you got kernel mode. So that’s nice. 97 00:06:41,020 --> 00:06:44,950 Beyond that, in terms of cryptography on the system, 98 00:06:44,950 --> 00:06:49,030 basically, they went all out. So, anything that can be signed, is signed. 99 00:06:49,030 --> 00:06:51,570 So, that includes the firmware, that includes every application. 100 00:06:51,570 --> 00:06:55,480 Signatures are checked not only at install time but also at runtime, 101 00:06:55,480 --> 00:06:58,750 so that’s something to keep in mind. 102 00:06:58,750 --> 00:07:02,889 Same thing: anything that can be encrypted is encrypted. 103 00:07:02,889 --> 00:07:07,650 And anything that can be made, well, console-specific through cryptography 104 00:07:07,650 --> 00:07:13,270 or authentication, such as internal permanent storage 105 00:07:13,270 --> 00:07:17,510 or the data that is stored on the SD card, or savegames, 106 00:07:17,510 --> 00:07:22,740 or extra data for games, this is all made console-specific. 
107 00:07:22,740 --> 00:07:26,510 And gamecard-specific in regards of savegames. 108 00:07:26,510 --> 00:07:31,470 So, that’s kind of annoying as well. And, of course, all this is handled by the ARM9 109 00:07:31,470 --> 00:07:35,590 using the hardware… the crypto hardware, so we got to get through that 110 00:07:35,590 --> 00:07:38,190 if we want to do interesting things. 111 00:07:38,190 --> 00:07:43,860 So, first we are going to go through the first layer, which is the ARM11 userland. 112 00:07:43,860 --> 00:07:47,320 Basically, getting a full hold onto the system. 113 00:07:47,320 --> 00:07:51,370 So, we first need to find some kind of entry point. 114 00:07:51,370 --> 00:07:55,780 There are problems… well, there are challenges there. 115 00:07:55,780 --> 00:07:59,760 One of the challenges is that the system implements 116 00:07:59,760 --> 00:08:05,080 strict Data Execution Prevention. So, existing pages will never be read… 117 00:08:05,080 --> 00:08:09,290 well, will never be read-write-executable. It’s all only going to be read-only, 118 00:08:09,290 --> 00:08:13,480 or read-writable or read-executable. There’s no way from a standard application 119 00:08:13,480 --> 00:08:18,079 to reprotect or map new pages that are read-write-executable. 120 00:08:18,079 --> 00:08:22,180 Because all of the system calls are locked out, except for 121 00:08:22,180 --> 00:08:26,400 higher privileged system modules. Another thing is 122 00:08:26,400 --> 00:08:29,840 that there is no ASLR, so that is not a challenge, that’s actually kind of nice. 123 00:08:29,840 --> 00:08:34,020 The nice thing here is that we… well, that makes savegame vulnerabilities 124 00:08:34,020 --> 00:08:37,010 totally fair game because, well, we don’t need an actual scripting environment 125 00:08:37,010 --> 00:08:40,640 or any kind of exotic vulnerability to exploit this. 126 00:08:40,640 --> 00:08:44,930 As long as we can get past DEP somehow. 
And then, 127 00:08:44,930 --> 00:08:48,990 of course, the fact that all savegames are both encrypted 128 00:08:48,990 --> 00:08:52,960 and made specific either to the gamecard or the game console, 129 00:08:52,960 --> 00:08:57,630 in the case of eShop games, is really annoying for savegame vulnerabilities 130 00:08:57,630 --> 00:09:01,450 because basically you can’t use those as an initial entry point in most cases, 131 00:09:01,450 --> 00:09:05,460 because, well, you can’t generate the right, well, AES MAC, 132 00:09:05,460 --> 00:09:12,160 or just… you don’t know the right cryptography. So, that’s annoying. 133 00:09:12,160 --> 00:09:15,300 Thankfully, the 3DS runs WebKit… 134 00:09:15,300 --> 00:09:18,470 *laughter* 135 00:09:18,470 --> 00:09:21,780 So, that’s nice. Can always use that. 136 00:09:21,780 --> 00:09:26,400 *applause* 137 00:09:26,400 --> 00:09:29,690 So, WebKit is used in a number of places, obviously it’s used in the main web browser, 138 00:09:29,690 --> 00:09:32,810 which you can access from the home menu. It’s also used in the YouTube application, 139 00:09:32,810 --> 00:09:37,210 which is available free on the eShop and doesn’t use any kind of 140 00:09:37,210 --> 00:09:41,180 client side authentication for the server, so you can just redirect traffic through, 141 00:09:41,180 --> 00:09:46,589 like a DNS server for example. Miiverse applet, other stuff, that also uses it. 142 00:09:46,589 --> 00:09:50,870 Slightly more secure, but might be usable at some point, I don’t know. 143 00:09:50,870 --> 00:09:54,900 Anywho, the important part here, is that it’s not only using WebKit, 144 00:09:54,900 --> 00:09:59,310 it is using a very old version of WebKit. Basically, they do cherrypick 145 00:09:59,310 --> 00:10:03,290 some patches into the version of WebKit they use, but only 146 00:10:03,290 --> 00:10:10,040 after we exploit those on release, so it comes a little too late, most of the time. 
147 00:10:10,040 --> 00:10:15,690 So yeah, this has been used by multiple people, most notably yellows8, 148 00:10:15,690 --> 00:10:21,580 but it has proven to be a very efficient, reliable entry point. 149 00:10:21,580 --> 00:10:25,690 Beyond that, we got Cubic Ninja as initial entry point. Cubic Ninja is a game 150 00:10:25,690 --> 00:10:30,020 that was released in 2011 on Nintendo 3DS. It is nice, because it actually 151 00:10:30,020 --> 00:10:34,350 allows users to share levels that they make themselves 152 00:10:34,350 --> 00:10:40,850 through QR codes; and also it is really bad at parsing those levels. 153 00:10:40,850 --> 00:10:44,910 So what you can do, is just, well, manufacture your own QR code 154 00:10:44,910 --> 00:10:47,740 that is going to crash the game and give you access. So these are 155 00:10:47,740 --> 00:10:52,529 nice initial entry points. So, once we’ve got this, what we have to remember is 156 00:10:52,529 --> 00:10:56,020 that we might be able to crash the game and may be able to control registers, 157 00:10:56,020 --> 00:11:00,550 but we don’t actually have our code running because of that. So, 158 00:11:00,550 --> 00:11:04,200 the obvious solution to this is to use ROP. 159 00:11:04,200 --> 00:11:07,770 For those of you, who are not familiar with ROP: 160 00:11:07,770 --> 00:11:11,730 You build your own fake stack that lets you return into 161 00:11:11,730 --> 00:11:15,899 code snippets that are located right before return instructions. That way… 162 00:11:15,899 --> 00:11:20,750 so this is an example. You can just 163 00:11:20,750 --> 00:11:24,779 jump to this kind of instruction, so ‘pop {r0, pc}’ and then 164 00:11:24,779 --> 00:11:29,220 this is going to let you load your own register value and then it is going to 165 00:11:29,220 --> 00:11:33,870 jump to the next instruction that you give it. 
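To make the ROP idea just described concrete, here is a small Python sketch that simulates a fake stack driving a ‘pop {r0, pc}’ gadget. The gadget addresses, the second "store" gadget and the helper name run_chain are all made up for illustration; it models only the control flow and is not 3DS code.

```python
# Toy model of return-oriented programming on an ARM-like machine.
# All addresses and gadgets are hypothetical; nothing here is real 3DS code.

POP_R0_PC = 0x00100000   # hypothetical address of a 'pop {r0, pc}' gadget
STORE_R0 = 0x00100010    # hypothetical gadget: record r0, then 'pop {pc}'
END = 0x0                # sentinel value that stops the simulation


def run_chain(stack):
    """Walk a fake stack the way the CPU would after a hijacked return."""
    regs = {"r0": 0, "pc": stack[0]}
    sp = 1                      # stack[0] was consumed as the first return address
    stored = []
    while regs["pc"] != END:
        if regs["pc"] == POP_R0_PC:
            # pop {r0, pc}: two words come off the attacker-controlled stack
            regs["r0"] = stack[sp]
            regs["pc"] = stack[sp + 1]
            sp += 2
        elif regs["pc"] == STORE_R0:
            # stand-in for a useful code snippet that ends in a return
            stored.append(regs["r0"])
            regs["pc"] = stack[sp]
            sp += 1
        else:
            raise ValueError(f"no gadget at {regs['pc']:#x}")
    return stored


fake_stack = [
    POP_R0_PC, 0x41414141, STORE_R0,   # r0 = 0x41414141, then "return" into STORE_R0
    POP_R0_PC, 0x42424242, STORE_R0,   # r0 = 0x42424242, then "return" into STORE_R0
    END,
]
result = run_chain(fake_stack)
```

No new instructions are ever written in this model: the attacker only supplies data (the fake stack), and existing snippets do the work, which is why DEP alone does not stop it.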
So, this is a way of executing code 166 00:11:33,870 --> 00:11:37,580 without actually executing code, which is widely used; so this is like 167 00:11:37,580 --> 00:11:42,080 the obvious thing to do. Of course, ROP is annoying. It is very limiting. 168 00:11:42,080 --> 00:11:47,560 It can be enough to actually execute an exploit to get higher privileges, 169 00:11:47,560 --> 00:11:53,149 but overall it is just annoying and very limiting for homebrew, for example. 170 00:11:53,149 --> 00:11:56,000 And of course, as I mentioned earlier, we don’t have access to any of the system calls 171 00:11:56,000 --> 00:12:01,010 that would let us map read-writable-executable pages. 172 00:12:01,010 --> 00:12:04,850 Also, the system does support dynamically linked libraries, so that might be a way, 173 00:12:04,850 --> 00:12:09,560 but these are signed and checked in places that we can’t access at this point. 174 00:12:09,560 --> 00:12:13,959 So, what we’re going to look at next is the GPU to see 175 00:12:13,959 --> 00:12:19,070 if we can use that to bypass that. What you can see here is that 176 00:12:19,070 --> 00:12:23,220 the GPU has access not only to video RAM, but also to FCRAM, 177 00:12:23,220 --> 00:12:26,420 which is, if you recall it, main memory. So, if you look at this, 178 00:12:26,420 --> 00:12:30,540 with all the different memory regions, 179 00:12:30,540 --> 00:12:33,480 we have got the Application Region here, which is entirely contained within 180 00:12:33,480 --> 00:12:38,700 what the GPU can access within FCRAM. Of course, the GPU cannot actually access 181 00:12:38,700 --> 00:12:42,790 all of that FCRAM, so that is kind of limiting. What we can see here, 182 00:12:42,790 --> 00:12:49,279 is that, of course, application code is within range of the GPU’s level of access. 
183 00:12:49,279 --> 00:12:53,250 The reason the GPU has access to FCRAM and Video RAM, through DMA, 184 00:12:53,250 --> 00:12:58,209 by the way, is, so that it can access information such as textures, 185 00:12:58,209 --> 00:13:01,030 vertex buffers, this sort of thing. 186 00:13:01,030 --> 00:13:04,240 So, it’s actually kind of important. And the reason it can write to it is because 187 00:13:04,240 --> 00:13:08,730 it has to render its data somewhere. The point is, that we can use this 188 00:13:08,730 --> 00:13:12,050 to render data into main memory. 189 00:13:12,050 --> 00:13:16,490 And main memory contains application code. And since the physical layout is 190 00:13:16,490 --> 00:13:20,200 actually completely deterministic, and even if it wasn’t, we could just use the 191 00:13:20,200 --> 00:13:23,580 read capabilities of the GPU to search for what we are looking for. 192 00:13:23,580 --> 00:13:27,970 Well, we can use this to overwrite our current application’s text section 193 00:13:27,970 --> 00:13:32,610 and we get code execution that way, in spite of DEP. 194 00:13:32,610 --> 00:13:34,440 Yeah, so this is where we get code execution… 195 00:13:34,440 --> 00:13:35,280 *applause* 196 00:13:35,280 --> 00:13:37,779 We execute our own, unsigned code, which is very… 197 00:13:37,779 --> 00:13:39,830 *applause* 198 00:13:39,830 --> 00:13:44,520 It’s great, but we are still confined within the application sandbox. 199 00:13:44,520 --> 00:13:47,450 So, we bypassed DEP, we are inside the sandbox. 200 00:13:47,450 --> 00:13:53,140 This means we can only access our current application’s savedata, 201 00:13:53,140 --> 00:13:58,120 so if we want to install some kind of secondary exploit, this is too limiting. 202 00:13:58,120 --> 00:14:02,190 We can only access certain services and system calls, which is also limiting 203 00:14:02,190 --> 00:14:06,200 and frustrating. 
And we can’t alter memory layout, so we can’t allocate 204 00:14:06,200 --> 00:14:08,769 more executable pages, as I mentioned earlier. 205 00:14:08,769 --> 00:14:10,779 So, we are still kind of limited at this point. 206 00:14:10,779 --> 00:14:14,680 So, what we are going to do, is look at what else the GPU can access. 207 00:14:14,680 --> 00:14:18,630 And what you can see is that, of course, there is this entirely separate memory region 208 00:14:18,630 --> 00:14:21,780 the GPU can modify. 209 00:14:21,780 --> 00:14:24,860 So it can access most of the System Region. And the System Region contains 210 00:14:24,860 --> 00:14:27,510 a few things. It contains the home menu, as I mentioned, because that is an applet. 211 00:14:27,510 --> 00:14:31,500 It contains the internet browser, and it contains actually a single system module, 212 00:14:31,500 --> 00:14:38,020 which is called ‘NS’, which we think stands for ‘Nintendo Shell’, we don’t really know. 213 00:14:38,020 --> 00:14:42,810 So, let’s look at this. First we got NS code well beyond the GPU cutoff. 214 00:14:42,810 --> 00:14:46,110 We got menu code, which is also well beyond GPU cutoff. 215 00:14:46,110 --> 00:14:51,310 But we got the menu’s heap, right here, well, actually there is separate heaps, 216 00:14:51,310 --> 00:14:55,089 these are well within the GPU’s range, so that’s good. 217 00:14:55,089 --> 00:14:59,830 NS unfortunately is still well beyond the cutoff. All of its data, all of its code. 218 00:14:59,830 --> 00:15:03,059 So we apparently can’t get to that. 219 00:15:03,059 --> 00:15:07,830 So, then the idea is, to just, well, okay, so actually… 220 00:15:07,830 --> 00:15:11,029 What’s interesting here, is that the cutoff is right before the end of 221 00:15:11,029 --> 00:15:14,200 the System Region, which as we just saw, has some interesting things, but 222 00:15:14,200 --> 00:15:18,680 also excludes all of Base Region, which also has very interesting things. 
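The reachability question discussed here can be modelled in a few lines of Python. The talk gives no concrete numbers, so the region sizes and the cutoff below are made-up values chosen only to reproduce the picture described: the Application Region fully in range, the System Region cut off shortly before its end, and the Base Region out of range.

```python
# Hypothetical layout: offsets into FCRAM in MiB, chosen only for illustration.
GPU_CUTOFF = 100  # assumption: the GPU can reach offsets [0, GPU_CUTOFF)

REGIONS = {
    "application": (0, 64),
    "system": (64, 108),
    "base": (108, 128),
}


def gpu_reach(start, end, cutoff=GPU_CUTOFF):
    """Classify how much of the half-open range [start, end) the GPU can touch."""
    if end <= cutoff:
        return "full"
    if start >= cutoff:
        return "none"
    return "partial"


reach = {name: gpu_reach(*span) for name, span in REGIONS.items()}
```

With these illustrative numbers, anything placed in the last few MiB of the System Region (where NS sits in the talk's diagram) falls outside the GPU's range, while heaps lower in the same region are reachable.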
223 00:15:18,680 --> 00:15:23,670 So, it seems likely that Nintendo knew about the capabilities of GPU DMA, 224 00:15:23,670 --> 00:15:27,480 like the theoretical capabilities, but they didn’t do anything about it. 225 00:15:27,480 --> 00:15:30,899 So, it seems that they probably didn’t realize what we could do with it, 226 00:15:30,899 --> 00:15:33,220 which is a lot. 227 00:15:33,220 --> 00:15:37,630 So, basically, we got menu heaps. So what we do, is… we have a heap, and 228 00:15:37,630 --> 00:15:42,399 this is all C++ code. We are just going to find objects inside the heap 229 00:15:42,399 --> 00:15:46,790 and overwrite it. So it’s pretty simple. Just find an object, that is going to be 230 00:15:46,790 --> 00:15:50,300 triggered to some kind of synchronisation mechanism. In this case, it’s gonna be 231 00:15:50,300 --> 00:15:55,010 just ‘Return to Menu’. And we create some kind of fake vtable 232 00:15:55,010 --> 00:15:59,560 and get it to run our own stack pivot. And then we get… 233 00:15:59,560 --> 00:16:03,300 we get ROP execution under Home menu, which is cool. 234 00:16:03,300 --> 00:16:07,060 We still don’t have code execution in the Home menu, but that’s okay. 235 00:16:07,060 --> 00:16:10,630 So, we can do a bunch of stuff from ROP. 236 00:16:10,630 --> 00:16:16,180 We can access a new system service, which is called ‘ns:s’, 237 00:16:16,180 --> 00:16:19,890 which is very helpful, because it can kill any arbitrary process, as well as 238 00:16:19,890 --> 00:16:24,930 create new ones. Also it gives us access to SD card, which most applications 239 00:16:24,930 --> 00:16:29,690 actually don’t have. And it lets us decrypt/dump any title on the system. 240 00:16:29,690 --> 00:16:34,300 So any game, even if it uses new cryptography that Nintendo introduced, 241 00:16:34,300 --> 00:16:38,230 we can actually dump that, because for some reason, well, Home menu 242 00:16:38,230 --> 00:16:41,890 apparently needs access to that. 
And then we can also 243 00:16:41,890 --> 00:16:47,490 access and overwrite all that extra data used by any application, which is great. 244 00:16:47,490 --> 00:16:50,380 So we use this as a base for running homebrew. 245 00:16:50,380 --> 00:16:54,920 Our homebrew launcher is essentially just a service 246 00:16:54,920 --> 00:16:58,810 that runs in the background under Home menu process. It is written in ROP, 247 00:16:58,810 --> 00:17:02,370 which is kind of disgusting, but it works. *laughter* 248 00:17:02,370 --> 00:17:05,999 The ‘Service’ handles running homebrew, so the process is very simple. You just 249 00:17:05,999 --> 00:17:09,358 kill off the current application, you spawn a new one, and then you take it over 250 00:17:09,358 --> 00:17:15,019 using the GPU DMA access. And then, what we do is 251 00:17:15,019 --> 00:17:19,489 we send all of these new capabilities that we got through handles to the new process 252 00:17:19,489 --> 00:17:23,558 and that gives us some higher privilege homebrew. 253 00:17:23,558 --> 00:17:30,190 It also handles events, such as Home button, Power button, all that good stuff. 254 00:17:30,190 --> 00:17:33,749 Which is nice, because we can actually run code under any arbitrary application 255 00:17:33,749 --> 00:17:37,929 or game, so we can actually modify these games. We can run ROM hacks. 256 00:17:37,929 --> 00:17:41,179 So there has been a bunch of translations that can be run through this, for games 257 00:17:41,179 --> 00:17:44,469 that haven’t come out outside of Japan, so that’s pretty nice. 258 00:17:44,469 --> 00:17:46,889 It’s the same principle, you just launch the app, you take it over, 259 00:17:46,889 --> 00:17:50,769 you pass the code, and then you jump to it, essentially. 260 00:17:50,769 --> 00:17:53,959 All within the confines of userland, which is nice. 
261 00:17:53,959 --> 00:17:59,600 So, the other thing is, we can actually access any game or application’s data 262 00:17:59,600 --> 00:18:03,460 because we can run code under it. So, these things include 263 00:18:03,460 --> 00:18:07,970 savegame data for any game. So we can actually install more convenient 264 00:18:07,970 --> 00:18:11,980 secondary entry points, which do not rely on the browser, which can be 265 00:18:11,980 --> 00:18:15,749 patched any moment, or on some old game. 266 00:18:15,749 --> 00:18:21,019 So, some examples include ‘Menuhax’ by yellows8, which exploits 267 00:18:21,019 --> 00:18:27,539 faulty theme handling code, which was introduced in firmware 9.0. 268 00:18:27,539 --> 00:18:30,519 Which is really nice, because this way, you can actually just run homebrew 269 00:18:30,519 --> 00:18:35,359 right as Home menu is opened, so right on boot time, 270 00:18:35,359 --> 00:18:38,929 which is great. Then you got other games. Of course you got a Zelda game 271 00:18:38,929 --> 00:18:41,619 that’s vulnerable. *audience chuckles* 272 00:18:41,619 --> 00:18:44,549 This time it wasn’t the horse’s name, but pretty similar. 273 00:18:44,549 --> 00:18:48,389 And then you got other games. We got tons of entry points at this point. 274 00:18:48,389 --> 00:18:54,999 We’re really, literally drowning in them. So, this is nice. 275 00:18:54,999 --> 00:18:58,749 But we forgot about ‘Nintendo Shell’, right? It’s a very attractive target, 276 00:18:58,749 --> 00:19:03,090 for a couple of reasons. For one thing, it has access to the ‘am:u’ service, 277 00:19:03,090 --> 00:19:05,929 which can be used to downgrade any system title. 278 00:19:05,929 --> 00:19:09,600 It’s not actually designed to downgrade titles, the thing is, you can both 279 00:19:09,600 --> 00:19:13,200 install and uninstall titles. 
So, what happens is, 280 00:19:13,200 --> 00:19:16,639 if you uninstall a title, and then install an older version 281 00:19:16,639 --> 00:19:19,210 of that title, you actually bypass the version check. 282 00:19:19,210 --> 00:19:22,210 So, you can just do that to downgrade any system title 283 00:19:22,210 --> 00:19:27,699 and bring back old exploits, if that is necessary. 284 00:19:27,699 --> 00:19:30,320 Assuming you have access to the service. 285 00:19:30,320 --> 00:19:32,679 And of course it’s in a region that we can partially modify, 286 00:19:32,679 --> 00:19:35,989 so it’s an interesting target. 287 00:19:35,989 --> 00:19:38,769 Unfortunately, we can’t actually access its data right now. 288 00:19:38,769 --> 00:19:42,489 But maybe we can actually move it to somewhere, where we can. 289 00:19:42,489 --> 00:19:47,830 The idea is, if you were to kill NS, and then allocate something in its place, 290 00:19:47,830 --> 00:19:52,129 then run NS again, you can move it below the cutoff. 291 00:19:52,129 --> 00:19:54,519 *laughter* 292 00:19:54,519 --> 00:20:01,809 *applause* 293 00:20:01,809 --> 00:20:06,369 Thanks. But unfortunately it’s not that simple. That can’t work. 294 00:20:06,369 --> 00:20:10,790 The reason being, that we actually need NS to be running to launch NS again. 295 00:20:10,790 --> 00:20:13,369 So that kind of sucks. 296 00:20:13,369 --> 00:20:15,820 But… well, no. Actually we also can’t run 297 00:20:15,820 --> 00:20:17,960 a second instance of NS at the same time, 298 00:20:17,960 --> 00:20:20,369 so we can’t do that either. 299 00:20:20,369 --> 00:20:23,559 But interestingly… Well, the 3DS has an interesting feature, 300 00:20:23,559 --> 00:20:28,200 which is called ‘Safe Mode’. Basically it’s a second firmware, which is 301 00:20:28,200 --> 00:20:32,649 an old version of the regular one, and that 302 00:20:32,649 --> 00:20:37,070 creates a bunch of copies of system titles. 303 00:20:37,070 --> 00:20:41,499 Most of them, anyways. 
So that gives it a different ID. So, the idea is, 304 00:20:41,499 --> 00:20:44,249 that if it has got a different ID, we might be able to run it at the same time, 305 00:20:44,249 --> 00:20:48,129 because, well, PM might fail to notice that. Of course it doesn’t. 306 00:20:48,129 --> 00:20:51,889 It actually does notice that. So we can’t run the Safe Mode version of a title 307 00:20:51,889 --> 00:20:54,830 at the same time as the regular version of the title. But, 308 00:20:54,830 --> 00:20:59,960 for some reason, in the case of NS – you might not be able to see this very well, 309 00:20:59,960 --> 00:21:04,669 but we’ve got NS’s regular title right here, and then we got Safe Mode NS 310 00:21:04,669 --> 00:21:07,100 right here. And for some reason they created a new 3DS version 311 00:21:07,100 --> 00:21:12,070 of the Safe Mode version of NS, though there is no new 3DS version 312 00:21:12,070 --> 00:21:16,440 of the original NS. So that creates a separate title ID 313 00:21:16,440 --> 00:21:20,340 which we can run at the same time as regular NS. So then, the exploit 314 00:21:20,340 --> 00:21:25,059 becomes very simple. You keep NS running, just allocate enough data, that it will be 315 00:21:25,059 --> 00:21:29,440 below the cutoff; and then you just run new 3DS Safe Mode NS. 316 00:21:29,440 --> 00:21:33,239 And then it’s within range of the GPU and you can take it over and have 317 00:21:33,239 --> 00:21:36,979 access to everything. So, this is nice. 318 00:21:36,979 --> 00:21:43,509 It’s more of an oversight than a proper exploit, but whatever. 319 00:21:43,509 --> 00:21:46,399 So this gives us access to a bunch of system calls. Mostly 320 00:21:46,399 --> 00:21:50,909 service handling system calls, so we can post our own service, 321 00:21:50,909 --> 00:21:54,639 which can be useful for other exploits that I won’t get into, for 322 00:21:54,639 --> 00:21:59,190 impersonating other services to other system modules. 
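The relocation trick described above (keep NS running, fill memory, then start a second NS so it lands below the cutoff) can be sketched with a toy allocator. This assumes, purely for illustration, that the System Region is handed out from the top downward; the class name, sizes and cutoff are hypothetical and the real allocator is not described in the talk.

```python
class TopDownAllocator:
    """Toy allocator handing out blocks from the top of a region downward.

    The top-down policy is an assumption made for this sketch.
    """

    def __init__(self, base, size):
        self.base = base
        self.next_top = base + size

    def alloc(self, size):
        start = self.next_top - size
        if start < self.base:
            raise MemoryError("region exhausted")
        self.next_top = start
        return (start, start + size)   # half-open range [start, end)


CUTOFF = 100                                   # hypothetical GPU cutoff (MiB offset)
system = TopDownAllocator(base=64, size=44)    # hypothetical region [64, 108)

ns = system.alloc(4)         # regular NS starts first and sits above the cutoff
padding = system.alloc(6)    # attacker-controlled allocations fill the gap
safe_ns = system.alloc(4)    # the second NS instance now lands below the cutoff

ns_reachable = ns[1] <= CUTOFF
safe_ns_reachable = safe_ns[1] <= CUTOFF
```

In this model the first NS stays out of the GPU's range, while the second instance ends up entirely inside it and can then be overwritten the same way as the application's text section.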
323 00:21:59,190 --> 00:22:02,570 And then we got access to all of these services, which is great. 324 00:22:02,570 --> 00:22:06,559 So we can downgrade system titles arbitrarily. 325 00:22:06,559 --> 00:22:10,529 And this runs in background, which can always be helpful for homebrew. 326 00:22:10,529 --> 00:22:14,210 The only problem is at this point, it’s still new 3DS only, because 327 00:22:14,210 --> 00:22:20,519 it relies on this new 3DS title. But there are actually ways around that. 328 00:22:20,519 --> 00:22:24,269 This was just to show that we can actually get fairly high levels of privilege, 329 00:22:24,269 --> 00:22:28,759 even still just always staying in userland on the ARM11. 330 00:22:28,759 --> 00:22:32,199 And there are other, similar attacks to that. If you’re interested you can look up 331 00:22:32,199 --> 00:22:36,489 ‘rohax’, which is a similar attack in the system module. 332 00:22:36,489 --> 00:22:41,229 So, now derrek is going to talk to you about exploiting the ARM11 kernel. 333 00:22:41,229 --> 00:22:52,279 derrek? *applause* 334 00:22:52,279 --> 00:22:55,319 derrek: So, hi everyone! 335 00:22:55,319 --> 00:22:59,530 First, I will give you some very short inside view 336 00:22:59,530 --> 00:23:05,059 of the kernel, and then I will explain how you can exploit 337 00:23:05,059 --> 00:23:09,269 the latest version of the ARM11 kernel. 338 00:23:09,269 --> 00:23:12,190 So, 339 00:23:12,190 --> 00:23:16,199 this is actually Nintendo’s very first gaming console kernel. 340 00:23:16,199 --> 00:23:20,679 Like on any other older console, 341 00:23:20,679 --> 00:23:26,200 there was no kernel. All games were just running on bare metal. 342 00:23:26,200 --> 00:23:31,499 Like there was a kernel for the Wii, 343 00:23:31,499 --> 00:23:36,209 like a very small microkernel running on the security processor, 344 00:23:36,209 --> 00:23:41,039 but that wasn’t written by Nintendo. 
345 00:23:41,039 --> 00:23:44,830 So it’s their very first gaming console kernel. 346 00:23:44,830 --> 00:23:50,789 That kernel is made to be thread safe, 347 00:23:50,789 --> 00:23:54,830 so it can run on multiple cores 348 00:23:54,830 --> 00:23:58,679 at the same time and there are like 349 00:23:58,679 --> 00:24:02,659 130 system calls available. 350 00:24:02,659 --> 00:24:07,349 So that’s quite a lot, in my opinion. 351 00:24:07,349 --> 00:24:12,309 But usually, if you have gained execution 352 00:24:12,309 --> 00:24:16,999 in ARM11 userland, you only have access to, like, 353 00:24:16,999 --> 00:24:22,049 around 50 system calls. 354 00:24:22,049 --> 00:24:27,019 And there’s a reason for that, but I’m going to explain that in a second. 355 00:24:27,019 --> 00:24:34,210 So, internally, the kernel works with C++ objects. 356 00:24:34,210 --> 00:24:38,029 So here are some examples for system calls. So, we have 357 00:24:38,029 --> 00:24:43,539 ‘CreateSemaphore’, for example. That will just create 358 00:24:43,539 --> 00:24:47,259 a semaphore object in the kernel 359 00:24:47,259 --> 00:24:52,109 and it will return a handle to the userland. 360 00:24:52,109 --> 00:24:55,940 And when you want to do any operations 361 00:24:55,940 --> 00:24:59,879 on that semaphore, you have to pass that handle 362 00:24:59,879 --> 00:25:04,720 to the kernel, and it will look up this handle in a handle table 363 00:25:04,720 --> 00:25:10,919 to find the original C++ object. 364 00:25:10,919 --> 00:25:15,710 Also there are 2 different kinds of memory allocators. 365 00:25:15,710 --> 00:25:19,299 So, we have a memory allocator for the main memory, which is 366 00:25:19,299 --> 00:25:25,039 the FCRAM. And there is also a Slab Heap, 367 00:25:25,039 --> 00:25:29,869 where all the C++ objects are stored in. 
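The handle mechanism described above can be sketched as a small model. This is only an illustration of the idea (userland holds an opaque handle, the kernel resolves it through a table to the real object); the class and function names are assumptions, not the kernel's actual identifiers.

```python
# Illustrative model only: names are hypothetical, not the real kernel's.

class KSemaphore:
    """Stand-in for a kernel-side semaphore object."""

    def __init__(self, initial, maximum):
        self.count = initial
        self.maximum = maximum

    def release(self, n):
        if self.count + n > self.maximum:
            raise ValueError("semaphore count out of range")
        previous = self.count
        self.count += n
        return previous


class HandleTable:
    """Per-process table mapping opaque handles to kernel objects."""

    def __init__(self):
        self._objects = {}
        self._next_handle = 1

    def add(self, obj):
        handle = self._next_handle
        self._next_handle += 1
        self._objects[handle] = obj
        return handle

    def lookup(self, handle):
        # Userland never sees a kernel pointer, only the handle.
        obj = self._objects.get(handle)
        if obj is None:
            raise KeyError("invalid handle")
        return obj


def svc_create_semaphore(table, initial, maximum):
    """Create the object in the kernel and hand a handle back to userland."""
    return table.add(KSemaphore(initial, maximum))


def svc_release_semaphore(table, handle, n):
    """Resolve the handle to the original object, then operate on it."""
    return table.lookup(handle).release(n)
```

Every later operation goes through `lookup`, so a stale or forged handle is rejected before any object is touched.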
368 00:25:29,869 --> 00:25:35,239 And this Slab Heap is located in AXI WRAM, 369 00:25:35,239 --> 00:25:39,339 which is the ARM11 memory, 370 00:25:39,339 --> 00:25:43,659 where all the kernel code and data is in. 371 00:25:43,659 --> 00:25:50,450 Also, there’s an IPC system. 372 00:25:50,450 --> 00:25:53,680 IPC is ‘inter process communication’. 373 00:25:53,680 --> 00:26:05,149 And it basically allows you to talk to other processes 374 00:26:05,149 --> 00:26:08,269 like services, 375 00:26:08,269 --> 00:26:17,270 e.g. the GSP service or FS. 376 00:26:17,270 --> 00:26:21,939 So, let’s look at the security. 377 00:26:21,939 --> 00:26:28,779 So, the kernel is really small. There are only like 200KB of code, 378 00:26:28,779 --> 00:26:34,649 which is pure ARM code. And there are only like 1000 functions. 379 00:26:34,649 --> 00:26:39,659 So, they try to keep the code size very low 380 00:26:39,659 --> 00:26:46,720 and that makes it harder to find bugs. 381 00:26:46,720 --> 00:26:51,999 The code size is really small, and 382 00:26:51,999 --> 00:26:57,349 you don’t have really much to choose from 383 00:26:57,349 --> 00:27:03,690 what to exploit. Also there are no symbols included in the kernel. 384 00:27:03,690 --> 00:27:11,629 Like when you run strings on it, it will just give you some names of C++ objects, 385 00:27:11,629 --> 00:27:16,389 but there are no function names or something like that. 386 00:27:16,389 --> 00:27:21,039 As we have seen earlier it’s physically isolated 387 00:27:21,039 --> 00:27:26,599 in its own memory. Which turned out - of course - to be a good idea. 388 00:27:26,599 --> 00:27:33,679 Otherwise it would have been overwritable by the GPU eventually. 389 00:27:33,679 --> 00:27:38,299 And all objects have reference counting.
390 00:27:38,299 --> 00:27:43,450 So that’s similar to the C++ shared pointer 391 00:27:43,450 --> 00:27:49,809 where every object has a small field 392 00:27:49,809 --> 00:27:54,450 like a counter field and every time the kernel wants to use an object 393 00:27:54,450 --> 00:27:59,899 this counter gets increased. And every time the… 394 00:27:59,899 --> 00:28:04,239 like when the reference is no longer needed it will decrease the counter 395 00:28:04,239 --> 00:28:11,080 and when the counter reaches Zero it will automatically delete that object 396 00:28:11,080 --> 00:28:19,010 from the Slab Heap. So they are basically trying to prevent use-after-frees. 397 00:28:19,010 --> 00:28:24,009 Also I’m not sure if that’s a security measure 398 00:28:24,009 --> 00:28:29,690 but there are more than 100 panic calls in the kernel 399 00:28:29,690 --> 00:28:35,689 and that’s every 10th function 400 00:28:35,689 --> 00:28:44,019 - on average. And they have the syscall access restriction. 401 00:28:44,019 --> 00:28:51,909 So you - as I said - you only have access to like 50 system calls. 402 00:28:51,909 --> 00:28:55,189 All the interesting ones are disabled. 403 00:28:55,189 --> 00:29:01,729 E.g. you can’t map executable pages. 404 00:29:01,729 --> 00:29:06,039 On the other hand there is no ASLR. But at least 405 00:29:06,039 --> 00:29:11,809 they’re trying to change the memory mapping every time 406 00:29:11,809 --> 00:29:17,069 during a larger kernel update. 407 00:29:17,069 --> 00:29:22,549 Also there’s no stack protection. And the Userland is always mapped. 408 00:29:22,549 --> 00:29:29,059 So once you’ve got control over the program counter 409 00:29:29,059 --> 00:29:33,090 you can just jump to 410 00:29:33,090 --> 00:29:36,769 Userland pages that are marked as executable. 411 00:29:36,769 --> 00:29:40,899 So you don’t have to do ROP in the kernel. 412 00:29:40,899 --> 00:29:44,659 It’s pretty nice.
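The reference counting described above, where the counter goes up on every use, down on every release, and the object is removed from the Slab Heap at zero, could look roughly like this. It is a minimal model; `SlabHeap`, `KObject`, `acquire` and `release` are illustrative names, not taken from the kernel.

```python
# Minimal model of kernel object reference counting; names are hypothetical.

class SlabHeap:
    """Tracks which objects are currently allocated."""

    def __init__(self):
        self.live = []

    def allocate(self, obj):
        self.live.append(obj)

    def free(self, obj):
        self.live.remove(obj)


class KObject:
    def __init__(self, heap):
        self.heap = heap
        self.refcount = 1            # the creator holds the first reference
        heap.allocate(self)

    def acquire(self):
        """Called whenever the kernel starts using the object."""
        self.refcount += 1

    def release(self):
        """Called when a reference is no longer needed."""
        self.refcount -= 1
        if self.refcount == 0:
            self.heap.free(self)     # last reference gone: delete the object
```

As long as every user pairs `acquire` with `release`, the object cannot be freed while something still points at it, which is the use-after-free protection the scheme is aiming for.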
413 00:29:44,659 --> 00:29:50,599 But they tried to have an execution prevention 414 00:29:50,599 --> 00:29:57,810 in the kernel that is: they’re marking executable kernel pages 415 00:29:57,810 --> 00:30:01,899 – that is the code – they’re marking them as executable 416 00:30:01,899 --> 00:30:08,710 in their Page Table. So let’s take a look. 417 00:30:08,710 --> 00:30:14,819 The highlighted parts in orange are the kernel code sections. 418 00:30:14,819 --> 00:30:20,629 And as you can see like when looking at the first highlighted line 419 00:30:20,629 --> 00:30:24,909 it says ‘virtual address #FFF00’ etc. 420 00:30:24,909 --> 00:30:32,489 is mapped to the physical address 1FF80000. 421 00:30:32,489 --> 00:30:40,320 And it is marked as executable and you only have access to it 422 00:30:40,320 --> 00:30:45,219 in Kernel Mode, of course, and only Read access. Right? 423 00:30:45,219 --> 00:30:49,979 So this is correct. 424 00:30:49,979 --> 00:30:56,019 But when you look at the second line of that Page Table dump 425 00:30:56,019 --> 00:31:00,799 you will notice that there is another section 426 00:31:00,799 --> 00:31:05,960 which covers the entire AXI WRAM 427 00:31:05,960 --> 00:31:09,779 and it’s mapped as Read-Write. 428 00:31:09,779 --> 00:31:15,609 So it doesn’t really make sense. Yeah. 429 00:31:15,609 --> 00:31:23,939 So basically it’s completely useless. We have Read-Write access to it. 430 00:31:23,939 --> 00:31:28,430 So, to summarize everything, 431 00:31:28,430 --> 00:31:32,849 there’s actually no exploitation protection. Once we found 432 00:31:32,849 --> 00:31:38,700 an exploitable bug it’s pretty likely that we gain 433 00:31:38,700 --> 00:31:43,219 code execution in kernel mode. 434 00:31:43,219 --> 00:31:47,779 So, let’s find that bug. 435 00:31:47,779 --> 00:31:53,509 And I started at looking at the SVC table. 436 00:31:53,509 --> 00:31:59,809 So this is kind of the interface between kernel land and userland. 
437 00:31:59,809 --> 00:32:05,889 And this shows all system calls 438 00:32:05,889 --> 00:32:11,369 that are available in the kernel. So you have like normal system calls. 439 00:32:11,369 --> 00:32:18,049 For memory management you can map read- and writable pages; 440 00:32:18,049 --> 00:32:25,119 you can mirror pages and do other memory management stuff. 441 00:32:25,119 --> 00:32:30,869 And there’s also some configuration for threads like 442 00:32:30,869 --> 00:32:37,589 you can choose which core should be used for 443 00:32:37,589 --> 00:32:41,450 executing the thread and all that stuff. 444 00:32:41,450 --> 00:32:47,219 You have a really large range of synchronization objects 445 00:32:47,219 --> 00:32:51,119 like kernel mutexes and all that stuff. And of course 446 00:32:51,119 --> 00:32:56,299 you have IPC requesting, so you can 447 00:32:56,299 --> 00:33:03,099 send messages to services. And there’s a more advanced section 448 00:33:03,099 --> 00:33:09,270 like this is used by services mostly, 449 00:33:09,270 --> 00:33:14,629 because they have to respond to your IPC requests. 450 00:33:14,629 --> 00:33:20,769 And there’s also Kernel DMA, cache control, some things. 451 00:33:20,769 --> 00:33:26,710 And they have a set of debug system calls. 452 00:33:26,710 --> 00:33:31,099 It’s just basic debugging. You can set breakpoints, 453 00:33:31,099 --> 00:33:36,429 read and write process memory. But you don’t have access to them. 454 00:33:36,429 --> 00:33:39,919 Like on retail it’s not actually used. 455 00:33:39,919 --> 00:33:47,099 And so one last section is the Privileged section. 456 00:33:47,099 --> 00:33:53,719 And here are all the interesting system calls 457 00:33:53,719 --> 00:34:00,260 that allow you to create processes and 458 00:34:00,260 --> 00:34:07,249 map executable memory and all that stuff. 459 00:34:07,249 --> 00:34:13,870 Unfortunately, we can’t use the Advanced, Debug and Privileged system calls.
460 00:34:13,870 --> 00:34:19,810 I mean that would require exploiting some service. 461 00:34:19,810 --> 00:34:24,020 And that’s just more work for us. 462 00:34:24,020 --> 00:34:29,130 So this leaves us with the normal system calls. 463 00:34:29,130 --> 00:34:33,760 But IPC sounds really interesting. 464 00:34:33,760 --> 00:34:41,239 But unfortunately it’s full of panics. 465 00:34:41,239 --> 00:34:49,570 Also there’s not much to attack at synchronization object system calls. 466 00:34:49,570 --> 00:34:59,470 So you only have like this more interesting system call 467 00:34:59,470 --> 00:35:06,520 for local memory management. And in theory there’s a lot that you can mess up. 468 00:35:06,520 --> 00:35:12,290 Right? There’s a lot that can possibly go wrong. And also we have 469 00:35:12,290 --> 00:35:17,030 unchecked DMA access! Like through the GPU. 470 00:35:17,030 --> 00:35:22,180 So maybe we can do something useful with that. 471 00:35:22,180 --> 00:35:26,430 Okay, so let’s have a look at the memory allocator. 472 00:35:26,430 --> 00:35:30,440 There are 2 types of memory allocators. 473 00:35:30,440 --> 00:35:37,080 First is the regular one. And it’s just for mapping normal heap 474 00:35:37,080 --> 00:35:43,700 like for malloc in C, e.g. And you have the linear memory allocator 475 00:35:43,700 --> 00:35:49,250 that is used for GPU textures, like 476 00:35:49,250 --> 00:35:55,080 when memory has to be physically contiguous 477 00:35:55,080 --> 00:35:58,740 you use the linear memory allocator. 478 00:35:58,740 --> 00:36:03,910 And there’s the FCRAM memory layout that we saw earlier. 479 00:36:03,910 --> 00:36:09,920 You have these 3 regions and every region has 480 00:36:09,920 --> 00:36:14,930 its own set of free pages. 481 00:36:14,930 --> 00:36:21,740 So how are they keeping track of them?
482 00:36:21,740 --> 00:36:27,430 So you have a region descriptor which tells us the dimensions like: 483 00:36:27,430 --> 00:36:32,020 where does it start, the region, and its size. And you get also 484 00:36:32,020 --> 00:36:39,410 a pointer to the first free piece of memory 485 00:36:39,410 --> 00:36:47,230 in that region. And each free piece of memory 486 00:36:47,230 --> 00:36:53,650 which we call a Memchunk has a Memchunk header 487 00:36:53,650 --> 00:36:58,450 right at the beginning. And it basically tells the kernel 488 00:36:58,450 --> 00:37:03,850 how large that Memchunk is. And it’s also linked 489 00:37:03,850 --> 00:37:08,410 in a Doubly Linked List. So you have a next and previous pointer 490 00:37:08,410 --> 00:37:15,030 pointing to the next and previous Memchunk headers. 491 00:37:15,030 --> 00:37:20,970 It kind of looks like that. So you have the red parts 492 00:37:20,970 --> 00:37:29,170 which are the free Memchunks and the green parts are memory 493 00:37:29,170 --> 00:37:34,760 that is already allocated. So 494 00:37:34,760 --> 00:37:40,240 allocation is pretty straightforward. It’s not really complicated. 495 00:37:40,240 --> 00:37:45,900 So the first thing that the allocator function does: 496 00:37:45,900 --> 00:37:52,170 it loads the next free pointer from the region descriptor. 497 00:37:52,170 --> 00:37:59,230 And for regular memory it just goes through the list 498 00:37:59,230 --> 00:38:05,380 following the pointers and it sums up their size 499 00:38:05,380 --> 00:38:10,670 until the requested size is reached. For linear memory it would just 500 00:38:10,670 --> 00:38:17,120 look for a suitable memory chunk to make sure that the memory is really contiguous. 501 00:38:17,120 --> 00:38:22,490 So when it found enough memory it sets the next pointer 502 00:38:22,490 --> 00:38:28,230 of the very last Memchunk to Zero.
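The regular-memory allocation walk described above can be modelled in a few lines. This is a sketch under simplifying assumptions: chunks are Python objects rather than headers inside free memory, the last chunk is never split when it is larger than needed, and the names `MemChunk`, `Region` and `allocate_regular` are illustrative. The final bookkeeping (moving the region's first-free pointer and returning the first chunk) is included so the function is complete.

```python
# Simplified model of the free-list walk; all names are hypothetical.

class MemChunk:
    def __init__(self, addr, pages):
        self.addr = addr      # start address of the free chunk
        self.pages = pages    # chunk size in pages
        self.next = None      # next free chunk, or None
        self.prev = None      # previous free chunk, or None


class Region:
    def __init__(self, start, size, first_free):
        self.start = start
        self.size = size
        self.first_free = first_free


def link(chunks):
    """Helper for the example: chain chunks into a doubly linked list."""
    for a, b in zip(chunks, chunks[1:]):
        a.next, b.prev = b, a
    return chunks[0]


def allocate_regular(region, pages_wanted):
    """Take chunks from the head of the list until enough pages are collected."""
    first = region.first_free
    chunk, last, total = first, None, 0
    while chunk is not None and total < pages_wanted:
        total += chunk.pages
        last = chunk
        chunk = chunk.next
    if total < pages_wanted:
        return None                 # not enough free memory in this region
    rest = last.next
    last.next = None                # terminate the allocated list with zero
    if rest is not None:
        rest.prev = None
    region.first_free = rest        # region now starts at the remaining chunks
    return first
```

The caller gets back a list of chunks that need not be adjacent, which is why linear (contiguous) memory needs a separate search for a single suitable chunk.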
It will then 503 00:38:28,230 --> 00:38:33,690 update the list and also the next free pointer 504 00:38:33,690 --> 00:38:38,550 for the region descriptor and finally it will return 505 00:38:38,550 --> 00:38:44,780 a pointer to the first Memchunk. So, 506 00:38:44,780 --> 00:38:48,930 let’s look at this from a security perspective. 507 00:38:48,930 --> 00:38:53,410 And there’s a problem. They basically have kernel structures 508 00:38:53,410 --> 00:38:59,500 inside the FCRAM! And that is a problem 509 00:38:59,500 --> 00:39:03,930 because we have DMA access to it through the GPU. 510 00:39:03,930 --> 00:39:08,740 And there was an attack by yellows8 511 00:39:08,740 --> 00:39:13,180 that is called ‘memchunkhax’. And what he did 512 00:39:13,180 --> 00:39:17,060 is basically: he overwrote memchunk headers 513 00:39:17,060 --> 00:39:21,540 with the GPU DMA flaw. And then 514 00:39:21,540 --> 00:39:27,330 he gained an arbitrary kernel write 515 00:39:27,330 --> 00:39:31,710 when it’s deallocating memory. So because 516 00:39:31,710 --> 00:39:36,790 next/prev pointers have been modified. 517 00:39:36,790 --> 00:39:42,140 So, unfortunately, this was fixed by Nintendo 518 00:39:42,140 --> 00:39:47,600 in system update 9.3 last year, 519 00:39:47,600 --> 00:39:54,100 like 1 year ago. And the new kernel will now verify every memchunk header 520 00:39:54,100 --> 00:40:00,280 during allocation. Like its size and also next/prev pointers. 521 00:40:00,280 --> 00:40:08,160 So, in theory, everything has been fixed. Invalid pointers or invalid sizes 522 00:40:08,160 --> 00:40:16,870 will just result in a kernel panic. In theory. 523 00:40:16,870 --> 00:40:22,260 So when you look at the system call for Controlmemory… 524 00:40:22,260 --> 00:40:29,140 we have access to it. It’s one of the normal system calls. 525 00:40:29,140 --> 00:40:33,520 It does basic stuff. You can map/free RW pages, 526 00:40:33,520 --> 00:40:41,040 but not executable of course. 
And it takes an address and size as argument. 527 00:40:41,040 --> 00:40:46,530 And also an operation code which tells the kernel what to do: 528 00:40:46,530 --> 00:40:50,670 to map or free pages, whatever. 529 00:40:50,670 --> 00:40:55,590 So first it does some basic checks on the address 530 00:40:55,590 --> 00:41:01,710 and eventually it will call a very large function. 531 00:41:01,710 --> 00:41:08,640 And I just call that function kern::controlmemory. 532 00:41:08,640 --> 00:41:14,980 So what does kern::controlmemory do? It calls the allocator function 533 00:41:14,980 --> 00:41:20,550 and it will just return a memchunk header pointer 534 00:41:20,550 --> 00:41:28,460 – as we have seen earlier. Then it goes through all of the allocated memchunks 535 00:41:28,460 --> 00:41:33,100 and it’s mapping them to user space. 536 00:41:33,100 --> 00:41:40,330 And it’s also updating some block information for the KProcess object. 537 00:41:40,330 --> 00:41:47,490 So there’s a problem. There’s obviously a race condition. 538 00:41:47,490 --> 00:41:57,070 Like we can overwrite memchunk headers after they have been allocated. 539 00:41:57,070 --> 00:42:03,570 So we could try using the GPU but it’s really slow, actually, 540 00:42:03,570 --> 00:42:11,020 because we would have to ask the GSP service to read memory 541 00:42:11,020 --> 00:42:19,570 and we have to go through this very large IPC kernel code. 542 00:42:19,570 --> 00:42:26,730 And that would be probably too slow. Allocation is really fast. 543 00:42:26,730 --> 00:42:30,930 Let’s dig a little bit deeper. 544 00:42:30,930 --> 00:42:38,060 I tried to reconstruct the source code in C. 545 00:42:38,060 --> 00:42:44,040 So this is the first step. It tries to allocate memory. 546 00:42:44,040 --> 00:42:54,070 For this example, it will just allocate regular memory. 547 00:42:54,070 --> 00:42:58,510 So it has found a memchunk, 548 00:42:58,510 --> 00:43:04,700 which means that enough memory is available.
549 00:43:04,700 --> 00:43:11,890 It will then execute this really interesting do-while loop. 550 00:43:11,890 --> 00:43:15,520 I know, it’s a lot of code. I’m not sure that you can actually read it. 551 00:43:15,520 --> 00:43:21,900 So let’s go quickly through this code. 552 00:43:21,900 --> 00:43:27,990 The pages read from the Memchunk header. It gets converted to a physical address. 553 00:43:27,990 --> 00:43:31,700 And that physical address gets mapped to userland 554 00:43:31,700 --> 00:43:38,980 by mem_map function. And then it will go to the next memchunk. 555 00:43:38,980 --> 00:43:45,410 Here. And it will also update the userland virtual address. 556 00:43:45,410 --> 00:43:49,500 And then it will clear that memory. So 557 00:43:49,500 --> 00:43:53,880 what’s wrong here? 558 00:43:53,880 --> 00:44:00,020 The problem is they’re mapping the Memorychunk into userland. 559 00:44:00,020 --> 00:44:05,770 And after it has been mapped they’re accessing it again. 560 00:44:05,770 --> 00:44:10,040 And what they access is the next pointer. 561 00:44:10,040 --> 00:44:13,250 So we can just overwrite it. 562 00:44:13,250 --> 00:44:19,509 When we have 2 threads running we can 563 00:44:19,509 --> 00:44:25,410 – from another CPU core – try to overwrite that pointer. 564 00:44:25,410 --> 00:44:32,320 So our goal would be to map kernel pages to userspace. 565 00:44:32,320 --> 00:44:37,510 But there are some problems. It requires really, really perfect timing. 566 00:44:37,510 --> 00:44:45,040 There’s only a very small time frame to do the overwrite. 567 00:44:45,040 --> 00:44:53,500 Also, we need a Memchunk header structure at the next pointer address… 568 00:44:53,500 --> 00:45:00,710 …to do this. To make sure we get a perfect timing 569 00:45:00,710 --> 00:45:06,810 I came up with a kernel address arbiter oracle. 570 00:45:06,810 --> 00:45:11,650 It is actually used for thread synchronization, we don’t care about it. 
571 00:45:11,650 --> 00:45:15,430 But it tries to read from an address and returns an error when the address is 572 00:45:15,430 --> 00:45:23,860 not accessible by userland. So we can use that system call 573 00:45:23,860 --> 00:45:28,600 to make sure that the memory has been mapped to userland. 574 00:45:28,600 --> 00:45:32,260 And once it has been mapped we’re trying to overwrite it. 575 00:45:32,260 --> 00:45:38,080 So one last problem: we have to inject a memchunk header 576 00:45:38,080 --> 00:45:44,720 into the kernel. I did this by using the Slab Heap. 577 00:45:44,720 --> 00:45:50,720 We can just create some KObjects and set their member variables 578 00:45:50,720 --> 00:45:56,170 to create a faked memchunk header. 579 00:45:56,170 --> 00:46:00,430 So this is the Slab Heap. We’ve got C++ objects, 580 00:46:00,430 --> 00:46:04,680 vtable pointer and some attributes. 581 00:46:04,680 --> 00:46:11,200 So the Slab Heap is basically just a really large area of C++ objects. 582 00:46:11,200 --> 00:46:17,030 And what I did was I changed the attributes 583 00:46:17,030 --> 00:46:22,170 and used them as Memchunk header. And I am redirecting 584 00:46:22,170 --> 00:46:29,950 the next-pointer to that object and it will map 585 00:46:29,950 --> 00:46:34,410 multiple C++ objects to userland. And that’s really nice because 586 00:46:34,410 --> 00:46:40,180 we have vtable pointers, so we can just overwrite them. 587 00:46:40,180 --> 00:46:44,440 And that means that we gain code execution.
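The flaw exploited here is a time-of-check/time-of-use problem: the kernel maps a memchunk into userland and only afterwards reads the chunk's next pointer from it. The following is a deterministic toy model of that ordering, not the reconstructed kernel code. Headers are a dictionary keyed by address, the racing second core is replaced by a callback that runs right after each mapping, and all addresses and names are made up for illustration.

```python
# Toy model of the map-then-read ordering bug; addresses and names are hypothetical.

SLAB_OBJECT = 0xFFF70000   # pretend kernel address holding a forged header


def map_chunks_buggy(headers, first, user_mapped, on_mapped=None):
    """Map each chunk to userland, then read its next pointer (the flaw)."""
    chunk = first
    while chunk != 0:
        user_mapped.add(chunk)            # userland can now write this chunk
        if on_mapped is not None:
            on_mapped(chunk)              # stands in for the racing second core
        chunk = headers[chunk]["next"]    # header is read AFTER mapping
    return user_mapped


def map_chunks_fixed(headers, first, user_mapped, on_mapped=None):
    """Same loop, but the next pointer is captured before the chunk is mapped."""
    chunk = first
    while chunk != 0:
        next_chunk = headers[chunk]["next"]
        user_mapped.add(chunk)
        if on_mapped is not None:
            on_mapped(chunk)
        chunk = next_chunk
    return user_mapped


def make_headers():
    return {
        0x30000000: {"pages": 1, "next": 0x30001000},
        0x30001000: {"pages": 1, "next": 0},
        SLAB_OBJECT: {"pages": 1, "next": 0},   # forged via kernel object attributes
    }


def make_attacker(headers, user_mapped):
    def attacker(chunk):
        # Redirect the first chunk, but only once it is really user-writable.
        if chunk == 0x30000000 and chunk in user_mapped:
            headers[chunk]["next"] = SLAB_OBJECT
    return attacker
```

In the buggy loop the forged address ends up in `user_mapped`, which corresponds to Slab Heap objects becoming writable from userland; in the fixed loop the overwrite comes too late to matter. On real hardware the window is tiny, which is why the talk uses the address arbiter as an oracle for when the mapping has happened.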
588 00:46:44,440 --> 00:46:49,570 So, as a summary, we set up some kernel objects, 589 00:46:49,570 --> 00:46:52,840 change their attributes, request memory from the kernel; 590 00:46:52,840 --> 00:46:57,290 and once it becomes available we patch the next-pointer, 591 00:46:57,290 --> 00:47:02,100 overwrite the mapped SlabHeap pages and 592 00:47:02,100 --> 00:47:08,060 then we call a system call which closes the handle 593 00:47:08,060 --> 00:47:11,940 for the kernel objects that we created in step one. 594 00:47:11,940 --> 00:47:17,470 So it will eventually call some vtable function 595 00:47:17,470 --> 00:47:23,560 and it will just jump to our modified vtable function. 596 00:47:23,560 --> 00:47:29,380 And we got ARM11 Level0 Code Execution!! 597 00:47:29,380 --> 00:47:38,750 *applause, motivated by smea* 598 00:47:38,750 --> 00:47:43,880 So, now plutoo will tell us what nice things you can do 599 00:47:43,880 --> 00:47:47,310 once you gained ARM11 Code execution. 600 00:47:47,310 --> 00:47:55,060 plutoo: Hey guys! Okay, so… the ARM9. 601 00:47:55,060 --> 00:47:58,990 Let’s go. 602 00:47:58,990 --> 00:48:05,500 The ARM9 is actually also used for executing old DS games. 603 00:48:05,500 --> 00:48:10,390 So what they do is, they actually, you could say, reused the ARM9 604 00:48:10,390 --> 00:48:14,210 which is their backwards compatibility processor. They use it 605 00:48:14,210 --> 00:48:21,130 as a security processor when executing 3DS code. 606 00:48:21,130 --> 00:48:24,890 And like smea said it’s running a stripped-down version 607 00:48:24,890 --> 00:48:30,700 of the ARM11 kernel. It basically only does thread synchronization, 608 00:48:30,700 --> 00:48:35,460 things like that. And there’s no MMU. There’s an MPU, 609 00:48:35,460 --> 00:48:39,560 8 regions you can configure. 610 00:48:39,560 --> 00:48:46,210 You could do no-execute within those regions etc. but 611 00:48:46,210 --> 00:48:50,280 the granularity is not very nice. And they only have 8.
612 00:48:50,280 --> 00:48:55,390 So they basically ran out of space. And .data+stack is executable 613 00:48:55,390 --> 00:49:00,020 as long as you can jump to it. And .text is writable 614 00:49:00,020 --> 00:49:06,240 so that’s bad. Basically whenever you can 615 00:49:06,240 --> 00:49:11,940 write code into arbitrary memory you can just overwrite code. 616 00:49:11,940 --> 00:49:16,250 These features – you don’t want them on a security processor. 617 00:49:16,250 --> 00:49:18,430 *laughter* 618 00:49:18,430 --> 00:49:23,740 So let’s go. So it turns out that 619 00:49:23,740 --> 00:49:28,040 there have been lots of exploits over the years and most of them are fixed. 620 00:49:28,040 --> 00:49:33,330 And most of them used the normal command interface. 621 00:49:33,330 --> 00:49:37,940 But in this case we’re taking a different approach. So 622 00:49:37,940 --> 00:49:42,730 on the 3DS the memory-mapped I/O is split up into 3 regions. 623 00:49:42,730 --> 00:49:46,420 There’s the ARM9-only I/O: it does crypto, 624 00:49:46,420 --> 00:49:50,980 it does DMA engine, 625 00:49:50,980 --> 00:49:54,760 things like that. Then there’s the Shared I/O region. 626 00:49:54,760 --> 00:49:58,030 And then, finally, there’s the ARM11 I/O region which contains 627 00:49:58,030 --> 00:50:02,760 the GPU video decoder. 628 00:50:02,760 --> 00:50:06,310 Thanks to derrek and smea we have full ARM11 control. 629 00:50:06,310 --> 00:50:09,680 We execute kernel mode. 630 00:50:09,680 --> 00:50:13,280 So the question is: can we use the shared I/O region, somehow, 631 00:50:13,280 --> 00:50:17,750 to own the ARM9? So it turns out 632 00:50:17,750 --> 00:50:21,550 the interface for reading old DS cartridges is actually 633 00:50:21,550 --> 00:50:24,940 in the shared I/O region. 634 00:50:24,940 --> 00:50:30,260 We’re not sure why this is, but 635 00:50:30,260 --> 00:50:33,970 they have it there for some reason. 
And it’s only the ARM9 636 00:50:33,970 --> 00:50:38,120 which is actually using this region. But ARM11 still has access to it. 637 00:50:38,120 --> 00:50:43,780 So when you insert the cartridge it starts by reading the banner. 638 00:50:43,780 --> 00:50:49,100 And it does this by writing this magic value to CTRL register. 639 00:50:49,100 --> 00:50:53,940 And basically it just asks for 0x200 [hex] bytes. 640 00:50:53,940 --> 00:50:56,490 And then there’s this loop. 641 00:50:56,490 --> 00:50:59,770 And this Assembler code is on the right side. 642 00:50:59,770 --> 00:51:04,640 You can see it basically waits for some bits to clear / to set 643 00:51:04,640 --> 00:51:11,170 and then they read 4 bytes and then they wait for another bit. 644 00:51:11,170 --> 00:51:15,520 And there’s no range check on the buffer. But it’s always 0x200 bytes, 645 00:51:15,520 --> 00:51:20,540 so it should be fine. 646 00:51:20,540 --> 00:51:24,510 What if we overwrite the CTRL register from ARM11 647 00:51:24,510 --> 00:51:27,880 asking for 0x4000 bytes? 648 00:51:27,880 --> 00:51:32,080 Boom! 649 00:51:32,080 --> 00:51:36,490 We have a nice buffer overrun. It’s in the BSS segment but… 650 00:51:36,490 --> 00:51:40,690 it’s still nice. And we can control the data. 651 00:51:40,690 --> 00:51:48,110 So the data actually comes from the cartridge. 652 00:51:48,110 --> 00:51:51,720 We need to make our own DS cartridge. So, 653 00:51:51,720 --> 00:51:56,030 there’s this old device, called the PassMe. It’s for the original DS, 654 00:51:56,030 --> 00:51:59,850 where you basically plug an old DS cartridge in 655 00:51:59,850 --> 00:52:03,960 and it basically modifies the header as it’s read. So, 656 00:52:03,960 --> 00:52:08,620 these are available online for 5 bucks. 657 00:52:08,620 --> 00:52:15,480 And then you add an FPGA. 658 00:52:15,480 --> 00:52:21,150 I implemented this and it works, but it’s very gimmicky. 659 00:52:21,150 --> 00:52:26,290 I don’t recommend it.
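The overrun described above comes down to a read loop whose length is taken from a register that another processor can rewrite, while the destination buffer has a fixed size. Here is a small model of that situation, assuming a flat `bytearray` as ARM9 memory. The offsets, the `SAFE` marker and the function names are invented for the example; only the 0x200 and 0x4000 lengths come from the talk.

```python
# Toy model of the unchecked cartridge read; layout and names are hypothetical.

BANNER_SIZE = 0x200   # size of the buffer the ARM9 code expects to fill


def read_from_cartridge(memory, buf_offset, ctrl_length, cartridge):
    """Copy ctrl_length bytes in 4-byte words; the buffer size is never checked."""
    for i in range(0, ctrl_length, 4):
        word = cartridge[i:i + 4]
        memory[buf_offset + i:buf_offset + i + 4] = word


def demo(ctrl_length):
    memory = bytearray(0x8000)
    buf_offset = 0x1000
    neighbour = buf_offset + BANNER_SIZE         # data placed right after the buffer
    memory[neighbour:neighbour + 4] = b"SAFE"
    cartridge = bytes([0x41]) * 0x4000           # attacker-controlled cartridge data
    read_from_cartridge(memory, buf_offset, ctrl_length, cartridge)
    return bytes(memory[neighbour:neighbour + 4])
```

With the intended length, `demo(0x200)` leaves the neighbouring data untouched; with the length the ARM11 writes into the register, `demo(0x4000)` replaces it with cartridge bytes.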
660 00:52:26,290 --> 00:52:30,790 And here’s my soldering, it’s not very nice. 661 00:52:30,790 --> 00:52:35,730 This gives us ARM9 code execution and this works on latest firmware. 662 00:52:35,730 --> 00:52:41,370 But we want something better. Let’s look at the chain of trust. 663 00:52:41,370 --> 00:52:46,620 The chain of trust: the idea is of course, you verify all the code that is running. 664 00:52:46,620 --> 00:52:51,560 But you’re basically verifying everything at load time. 665 00:52:51,560 --> 00:52:55,230 The 3DS has the simplest chain of trust you can have. 666 00:52:55,230 --> 00:52:58,560 There’s the Boot ROM at the start. And then it loads 667 00:52:58,560 --> 00:53:04,490 the firmware binary from NAND and it jumps to it. 668 00:53:04,490 --> 00:53:07,900 On the new 3DS they were a bit clever. 669 00:53:07,900 --> 00:53:12,760 They added an extra crypto layer on the ARM9 portion. 670 00:53:12,760 --> 00:53:17,520 But it’s actually part of the firmware binary. 671 00:53:17,520 --> 00:53:20,380 We call this ‘ARM9 loader’. 672 00:53:20,380 --> 00:53:23,530 So the theory that Nintendo had was: 673 00:53:23,530 --> 00:53:27,460 “Let’s add another layer of crypto, so we change the keys, 674 00:53:27,460 --> 00:53:32,470 we introduce new keys, and they can’t break it”. 675 00:53:32,470 --> 00:53:35,560 And they don’t have any worked-out place to put those keys. 676 00:53:35,560 --> 00:53:39,200 So they placed them in NAND! 677 00:53:39,200 --> 00:53:42,760 But they’re encrypted with the per-Console key that’s 678 00:53:42,760 --> 00:53:48,030 based on a hash of the OTP that’s unique for each Console. 679 00:53:48,030 --> 00:53:52,120 And then OTP access is disabled early in the Boot. 680 00:53:52,120 --> 00:53:59,410 So later on you can’t dump the OTP and you can’t figure out the keys. 681 00:53:59,410 --> 00:54:03,580 This looks safe, in theory. So here’s the implementation. 682 00:54:03,580 --> 00:54:08,620 So they calculate some hash of the OTP. 
They read the key-sector from NAND. 683 00:54:08,620 --> 00:54:12,430 And they decrypt the key. And they put it in a keyslot. 684 00:54:12,430 --> 00:54:17,180 It’s basically an isolated memory area. 685 00:54:17,180 --> 00:54:21,170 And then they generate a bunch of sub keys and 686 00:54:21,170 --> 00:54:24,620 they verify that the key they loaded from NAND is the correct one. 687 00:54:24,620 --> 00:54:30,810 So even if we were to switch the key they would detect that and just panic. 688 00:54:30,810 --> 00:54:35,300 And then they decrypt the ARM9 binary and they jump to the entry point. 689 00:54:35,300 --> 00:54:40,420 But… they forgot to clear the 0x11 key! 690 00:54:40,420 --> 00:54:44,190 So we can just get code execution later on. And we can just regenerate 691 00:54:44,190 --> 00:54:51,460 all those keys! So this implementation is useless. 692 00:54:51,460 --> 00:54:52,760 Okay. *laughs* 693 00:54:52,760 --> 00:54:58,960 *applause* 694 00:54:58,960 --> 00:55:03,760 And they fixed this because they have more than 1 key hidden in the NAND. 695 00:55:03,760 --> 00:55:07,780 So they took their next key. 696 00:55:07,780 --> 00:55:10,680 It’s basically the same idea: you calculate the same hash, you read 697 00:55:10,680 --> 00:55:14,920 the key sector from NAND, you generate all the previous keys for compatibility, 698 00:55:14,920 --> 00:55:19,900 and then you decrypt a new key, we call it Key#2. 699 00:55:19,900 --> 00:55:23,920 And then you decrypt ARM9 binary using the second key. 700 00:55:23,920 --> 00:55:27,780 You clear the keyslot, and you jump to entry point. 701 00:55:27,780 --> 00:55:32,010 But they forgot to verify the second key! *audience laughs* 702 00:55:32,010 --> 00:55:40,000 This is epic fail! *applause* 703 00:55:40,000 --> 00:55:44,520 So let’s exploit this. ‘ARM9LOADERHAX’. 704 00:55:44,520 --> 00:55:49,510 We can change the second key. ARM9 loader will just decrypt the binary 705 00:55:49,510 --> 00:55:54,820 to garbage and jump to it. 
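Since the loader jumps straight into whatever the wrong key decrypts to, what matters is how the first garbage word decodes as an ARM instruction. As a hedged illustration, the decoder below recognises the standard ARM `B`/`BL` encoding (condition in bits 31-28, `101` in bits 27-25, a signed 24-bit word offset) and computes the target. The function name is made up, and it ignores whether the condition would actually pass at run time.

```python
# Decoder for the ARM (A32) B/BL encoding; the function name is hypothetical.

def decode_arm_branch(word, address):
    """Return the branch target if `word` is an ARM B/BL instruction, else None."""
    cond = (word >> 28) & 0xF
    if cond == 0xF:
        return None                      # 0xF is not an ordinary condition code
    if (word >> 25) & 0b111 != 0b101:
        return None                      # not the B/BL encoding
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:
        imm24 -= 0x1000000               # sign-extend the 24-bit offset
    # The offset counts words and is relative to the instruction address + 8.
    return (address + 8 + (imm24 << 2)) & 0xFFFFFFFF
```

Under this encoding, 15 of the 16 condition values combined with 1 of the 8 patterns in bits 27-25 give 15/128 of all 32-bit words, roughly 12%, that decode as `B` or `BL`. Only a fraction of those will land somewhere useful, which is why many keys have to be tried.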
706 00:55:54,820 --> 00:56:00,110 If you look at the encoding of an ARM Branch instruction: 707 00:56:00,110 --> 00:56:04,310 the probability is pretty high that there will just be a Branch instruction. 708 00:56:04,310 --> 00:56:08,590 And just any random data will eventually… like if you try enough keys, 709 00:56:08,590 --> 00:56:14,810 it will eventually become a Branch instruction to some memory. 710 00:56:14,810 --> 00:56:19,490 So if we try a lot of keys, eventually we will find some garbage 711 00:56:19,490 --> 00:56:23,990 that is useful. 712 00:56:23,990 --> 00:56:29,680 This is the NAND, the Flash memory, of an unmodified 3DS 713 00:56:29,680 --> 00:56:37,349 – a new 3DS. So there’s a small key section, marked in teal, like, blue. 714 00:56:37,349 --> 00:56:41,660 And it contains those keys that we’re talking about. 715 00:56:41,660 --> 00:56:44,550 And then there are 2 firmware partitions. 716 00:56:44,550 --> 00:56:47,960 One is used for backup, in case one gets corrupted; 717 00:56:47,960 --> 00:56:52,119 so it doesn’t brick the device, whatever. 718 00:56:52,119 --> 00:56:57,190 We installed our custom key. 719 00:56:57,190 --> 00:57:00,920 And we installed the largest firm binary we have 720 00:57:00,920 --> 00:57:06,100 in the firm0 partition. And we keep the one with the vulnerability 721 00:57:06,100 --> 00:57:11,760 in the firm1 partition. And then we put our code payload 722 00:57:11,760 --> 00:57:17,250 on top of the firmware0 binary. 723 00:57:17,250 --> 00:57:21,340 And then we reboot. And so what will happen? 724 00:57:21,340 --> 00:57:24,070 The Bootrom is executed. 725 00:57:24,070 --> 00:57:29,660 It will load the first firmware partition. 726 00:57:29,660 --> 00:57:34,510 And it has our code in the end, but it doesn’t know about it. 727 00:57:34,510 --> 00:57:38,880 And then it decrypts it. And, you see, it looks okay. 728 00:57:38,880 --> 00:57:43,800 There’s the ARM9 loader stub in the front; and then comes the encrypted binary.
729 00:57:43,800 --> 00:57:48,170 And then, finally, there’s our payload. 730 00:57:48,170 --> 00:57:52,960 But Bootrom checks the hash, right? And it fails. 731 00:57:52,960 --> 00:57:58,280 So it thinks the partition got corrupted. 732 00:57:58,280 --> 00:58:03,000 So it will load the smaller one on top. You see we have our payload in memory, 733 00:58:03,000 --> 00:58:09,380 at Boot. And then it decrypts firmware1 734 00:58:09,380 --> 00:58:14,810 which is smaller and it still has ARM9 loader and another encrypted ARM9 binary. 735 00:58:14,810 --> 00:58:18,910 And then it jumps to ARM9 loader because the hash checks out. 736 00:58:18,910 --> 00:58:24,230 And then the ARM9 loader will decrypt our corrupted key 737 00:58:24,230 --> 00:58:28,940 from NAND and it will decrypt this one to garbage 738 00:58:28,940 --> 00:58:37,100 and it will jump to it. And hopefully it jumps to our code. 739 00:58:37,100 --> 00:58:41,770 So this gives us ARM9 code execution from cold Boot. 740 00:58:41,770 --> 00:58:46,230 Early, very early. So it turns out we can actually use this to get some keys 741 00:58:46,230 --> 00:58:52,000 that are later not available because they clear those… 742 00:58:52,000 --> 00:58:56,869 they use a certain memory area for seeding encryption engine to generate keys 743 00:58:56,869 --> 00:59:04,440 and the memory is later cleared. So you can’t regenerate the keys. 744 00:59:04,440 --> 00:59:08,400 But with this we can actually get those 2 keys. 745 00:59:08,400 --> 00:59:11,850 They’re called the firmware 6.x save-key 746 00:59:11,850 --> 00:59:15,780 and firmware 7.x NCCH-key. 747 00:59:15,780 --> 00:59:20,400 That’s a bonus. 748 00:59:20,400 --> 00:59:25,220 We talked a bit about the AES engine. It’s used everywhere for the crypto 749 00:59:25,220 --> 00:59:30,200 and it’s used for everything, basically. 750 00:59:30,200 --> 00:59:35,990 It supports all the usual block cipher modes. 
751 00:59:35,990 --> 00:59:40,940 It has 2 security features: it has write-only keys. Which is really useful. 752 00:59:40,940 --> 00:59:44,750 Like you write a key and then you can never ever read it back. 753 00:59:44,750 --> 00:59:49,770 This means that they can fill in the keys by the Bootrom 754 00:59:49,770 --> 00:59:56,150 and we can’t dump them later. 755 00:59:56,150 --> 01:00:01,300 So they can keep the keys secret. 756 01:00:01,300 --> 01:00:08,280 Even if we hacked the ARM9, even if we get code execution we’ll never get the keys. 757 01:00:08,280 --> 01:00:12,250 And then there’s the key scrambler. Which is that the key is actually 758 01:00:12,250 --> 01:00:16,320 – it’s an optional thing – where the actual key is hidden, 759 01:00:16,320 --> 01:00:21,090 calculated by a hardware function, that is never… 760 01:00:21,090 --> 01:00:26,359 that we don’t know about. So the key is actually never exposed to the CPU 761 01:00:26,359 --> 01:00:30,580 – the actual key. So we just feed it 2 values, 2 keys and then it generates 762 01:00:30,580 --> 01:00:35,000 a new key based on that. And we don’t know what that key is. 763 01:00:35,000 --> 01:00:40,500 So this creates a situation similar to the isolated SPUs on the PS3 764 01:00:40,500 --> 01:00:44,000 where you can ask it to decrypt stuff, but you don’t get the keys. 765 01:00:44,000 --> 01:00:49,640 And if you don’t get the keys, then… we want the keys!! 766 01:00:49,640 --> 01:00:53,300 We want to decrypt things on our PC because we’re lazy. 767 01:00:53,300 --> 01:00:57,720 So there’re 2 keys – KeyX, KeyY we call them. 768 01:00:57,720 --> 01:01:01,970 They’re 128bits and the normal key is derived 769 01:01:01,970 --> 01:01:06,250 as a function of those 2; and that function is unknown. 770 01:01:06,250 --> 01:01:12,040 It’s implemented in hardware, in silicon. 
771 01:01:12,040 --> 01:01:15,760 So even if we know X and Y we can’t figure out the normal key 772 01:01:15,760 --> 01:01:21,960 and we can’t decrypt things without asking the 3DS first. 773 01:01:21,960 --> 01:01:26,550 But we can poke this hardware engine. 774 01:01:26,550 --> 01:01:30,050 The first thing you notice when you do this is that if you set the N-th bit 775 01:01:30,050 --> 01:01:37,140 of the X key or the (N+2)-th bit in the Y key you get the same result. 776 01:01:37,140 --> 01:01:41,080 And in general, you find that the function that we’re looking for 777 01:01:41,080 --> 01:01:45,280 is actually just a function of one variable, where that variable is 778 01:01:45,280 --> 01:01:50,690 the X rotated by 2… 779 01:01:50,690 --> 01:01:56,100 so this is rotation, not shift… XOR-ed with Y. 780 01:01:56,100 --> 01:01:59,430 But we still don’t know the key. But we want to know keys. So… 781 01:01:59,430 --> 01:02:08,140 So step back a little bit. 782 01:02:08,140 --> 01:02:12,070 The keyscrambler is used for Mii QR-codes. 783 01:02:12,070 --> 01:02:18,740 It’s used for everything, right? So it’s used for a network protocol, called UDS, 784 01:02:18,740 --> 01:02:23,930 and it’s used for Download Play – which is when you download games over WiFi, 785 01:02:23,930 --> 01:02:28,000 temporary games. But the Wii U also supports all of this. 786 01:02:28,000 --> 01:02:31,180 But it doesn’t have the key scrambler in hardware. 787 01:02:31,180 --> 01:02:33,090 So the Wii U must be using normal keys. 788 01:02:33,090 --> 01:02:36,520 *applause* *screamed from audience: WHAT?* 789 01:02:36,520 --> 01:02:46,360 *applause* 790 01:02:46,360 --> 01:02:51,210 So we make a table of the shared keys and 791 01:02:51,210 --> 01:02:54,619 these are the 3 keys that are shared with the Wii U. 792 01:02:54,619 --> 01:03:00,240 And this is where the KeyX and KeyY on the 3DS… 793 01:03:00,240 --> 01:03:05,920 where they are set. And 2 of them have KeyY set by firmware.
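The finding from poking the engine can be written compactly. This is a sketch in notation chosen here for illustration (the symbols $F$, $G$, $e_N$ and $\operatorname{rol}$ are not from the slides): $F$ is the hardware key scrambler, $e_N$ is the 128-bit value with only bit $N$ set, and bit indices are taken modulo 128.

```latex
\begin{aligned}
F(X \oplus e_N,\; Y) &= F(X,\; Y \oplus e_{N+2})
  \qquad \text{for all } X,\ Y,\ N \\
\Longrightarrow\quad F(X, Y) &= G(t),
  \qquad t = \operatorname{rol}(X, 2) \oplus Y,
  \qquad G(t) := F(0, t)
\end{aligned}
```

The step works because rotating left by 2 moves bit $N$ to bit $N+2$, so $\operatorname{rol}(X \oplus e_N, 2) = \operatorname{rol}(X, 2) \oplus e_{N+2}$. Every set bit of KeyX can therefore be moved over into KeyY one at a time without changing the output, which leaves a function of the single value $t$.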
794 01:03:05,920 --> 01:03:11,510 So we can’t read the keys set by the Bootrom because it’s locked away 795 01:03:11,510 --> 01:03:17,310 and we don’t have it. But can we still figure out G? Let’s see. 796 01:03:17,310 --> 01:03:23,390 So I give a shoutout to shuffle2 and to fail0verflow, who hacked the Wii U, 797 01:03:23,390 --> 01:03:27,540 and they helped us… or shuffle helped us extract the Wii U keys. 798 01:03:27,540 --> 01:03:36,670 So thank you! Now we have KeyY and we know the normal key from the Wii U. 799 01:03:36,670 --> 01:03:39,740 However, KeyX is still unknown. 800 01:03:39,740 --> 01:03:44,560 And if G(t) is ‘bad’ then a small change in the KeyY 801 01:03:44,560 --> 01:03:48,970 will only lead to a small change in the normal key. 802 01:03:48,970 --> 01:03:53,369 It’s bad! So let’s look at the data. 803 01:03:53,369 --> 01:03:56,670 So when we flip one bit in the KeyY we can brute-force all keys 804 01:03:56,670 --> 01:04:01,390 similar to the normal key, that is, just within a couple of bit flips, 805 01:04:01,390 --> 01:04:06,540 and we find that it always results in the normal key 806 01:04:06,540 --> 01:04:12,980 with bits flipped at position either 87 or 88, 807 01:04:12,980 --> 01:04:16,340 sometimes 89, but never 86. 808 01:04:16,340 --> 01:04:22,359 So this reminds me of an adder where you have a carry bit 809 01:04:22,359 --> 01:04:26,160 being propagated to upper bits, but never to lower ones. 810 01:04:26,160 --> 01:04:30,980 So let’s guess that this is an adder and let’s try: 811 01:04:30,980 --> 01:04:37,599 it’s an adder with a rotation so we guess that G(t) = (t+C) 812 01:04:37,599 --> 01:04:45,140 – some constant C, we don’t know it – and rotated to the left by 87. 813 01:04:45,140 --> 01:04:50,680 And then we plug it into our original formula and we don’t know KeyX, remember, 814 01:04:50,680 --> 01:04:53,640 because it’s set by Bootrom, we don’t have it.
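The adder guess can be checked against the observation about bits 87, 88 and 89 with a small model. This is a sketch of the guessed form only: `guessed_scrambler` and `flipped_positions` are names made up here, and the key and constant values are toy values, not the real KeyX or C.

```python
MASK128 = (1 << 128) - 1


def rol128(value, amount):
    """Rotate a 128-bit value left by `amount` bits."""
    amount %= 128
    return ((value << amount) | (value >> (128 - amount))) & MASK128


def guessed_scrambler(key_x, key_y, const_c):
    """Guessed form: normal = rol(((rol(X, 2) ^ Y) + C) mod 2^128, 87)."""
    t = rol128(key_x, 2) ^ key_y
    return rol128((t + const_c) & MASK128, 87)


def flipped_positions(key_x, key_y, const_c, bit):
    """Bit positions of the normal key that change when one KeyY bit is flipped."""
    before = guessed_scrambler(key_x, key_y, const_c)
    after = guessed_scrambler(key_x, key_y ^ (1 << bit), const_c)
    diff = before ^ after
    return [i for i in range(128) if (diff >> i) & 1]


# Toy constants: flipping bit 0 of KeyY always changes bit 87, and the carry
# of the addition can spill into 88, 89, ... but never down into bit 86.
for toy_c in (0, 1, 3):
    print(toy_c, flipped_positions(0, 0, toy_c, 0))
# 0 [87]
# 1 [87, 88]
# 3 [87, 88, 89]
```

How far the run of flipped bits extends depends only on how far the carry travels, which is why 88 shows up often, 89 less often, and longer runs become rare quickly.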
815 01:04:53,640 --> 01:04:59,440 We don’t know the constant C because it’s in silicon, it’s in hardware. 816 01:04:59,440 --> 01:05:04,630 But if we look at the formula, and we consider the inequality, 817 01:05:04,630 --> 01:05:09,440 where we basically rotate right by 87 818 01:05:09,440 --> 01:05:13,500 – we’re basically undoing the outer rotation. 819 01:05:13,500 --> 01:05:18,810 And then we plug in our formula, our guess. And then we get this. 820 01:05:18,810 --> 01:05:23,300 And then we subtract C from both sides. We end up with this. 821 01:05:23,300 --> 01:05:28,510 And this is basically… we’re XOR-ing 2 different keys with the same X value 822 01:05:28,510 --> 01:05:34,810 rotated to the left by 2. 823 01:05:34,810 --> 01:05:38,150 Well, if you stare at this for a bit you’ll see that 824 01:05:38,150 --> 01:05:45,950 if y0 and y1 – which are 2 different KeyY’s – are equal except for 825 01:05:45,950 --> 01:05:52,240 at one bit position then the XOR is smallest 826 01:05:52,240 --> 01:05:58,100 for the one which shares the same bit value 827 01:05:58,100 --> 01:06:03,070 at the position that the 2 Y’s are differing at. 828 01:06:03,070 --> 01:06:07,740 It’s actually pretty simple but it sounds difficult. 829 01:06:07,740 --> 01:06:12,720 XOR is Zero if they’re the same input and One if they’re different. 830 01:06:12,720 --> 01:06:16,080 If they’re the same it’s Zero and it’s smaller. 831 01:06:16,080 --> 01:06:20,550 So we actually look bit-by-bit at this. And 832 01:06:20,550 --> 01:06:27,910 we repeat this 128 times. And we recover all 128 bits of the KeyX. 833 01:06:27,910 --> 01:06:32,740 And when we have the KeyX we can calculate the silicon constant C. 834 01:06:32,740 --> 01:06:38,250 So the end result is: the key scrambler is figured out 835 01:06:38,250 --> 01:06:45,290 and we have also the secret Bootrom KeyX for a couple of keyslots, as a bonus.
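The bit-by-bit recovery can be sketched against a simulated scrambler. Several things here are assumptions made for the sketch: the secret KeyX and C are random stand-ins, the oracle hands back the normal key directly (in the talk it is found by brute-forcing keys within a few bit flips of the known one), and the comparison is done as a difference modulo 2^128 so that wrap-around in the addition does not matter. One caveat visible in this model: the top bit of the rotated KeyX cannot be told apart from the top bit of C, so the recovered pair is checked by behaviour rather than by equality.

```python
import secrets

MASK128 = (1 << 128) - 1


def rol128(value, amount):
    """Rotate a 128-bit value left by `amount` bits."""
    amount %= 128
    return ((value << amount) | (value >> (128 - amount))) & MASK128


def ror128(value, amount):
    """Rotate a 128-bit value right by `amount` bits."""
    return rol128(value, 128 - (amount % 128))


def make_scrambler(key_x, const_c):
    """Model of the guessed hardware: KeyY in, normal key out."""
    def normal_key(key_y):
        return rol128(((rol128(key_x, 2) ^ key_y) + const_c) & MASK128, 87)
    return normal_key


def recover(normal_key, y0):
    """Recover a (KeyX, C) pair that reproduces the oracle `normal_key`."""
    n0 = ror128(normal_key(y0), 87)            # undo the outer rotation
    x_rot = 0                                  # will hold rol(KeyX, 2)
    for i in range(127):                       # bit 127 is handled below
        n1 = ror128(normal_key(y0 ^ (1 << i)), 87)
        # C cancels in the difference; only bit i of (x_rot ^ y) changed,
        # so the difference is +2^i or -2^i depending on that bit.
        diff = (n0 - n1) & MASK128
        xor_bit = 1 if diff == (1 << i) else 0
        x_rot |= (xor_bit ^ ((y0 >> i) & 1)) << i
    # Bit 127 of x_rot stays 0: flipping it together with the top bit of C
    # gives exactly the same scrambler, so either choice is valid.
    const_c = (n0 - (x_rot ^ y0)) & MASK128
    return ror128(x_rot, 2), const_c


secret_x, secret_c = secrets.randbits(128), secrets.randbits(128)
hardware = make_scrambler(secret_x, secret_c)
found_x, found_c = recover(hardware, secrets.randbits(128))
rebuilt = make_scrambler(found_x, found_c)
print(all(rebuilt(y) == hardware(y)
          for y in (secrets.randbits(128) for _ in range(100))))
```

The loop needs one known KeyY with its normal key plus one query per bit, which matches the "repeat this 128 times" description apart from the ambiguous top bit.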
836 01:06:45,290 --> 01:07:00,780 *applause, motivated by smea* 837 01:07:00,780 --> 01:07:04,530 I didn’t put the constants in the slides because I want this to be 838 01:07:04,530 --> 01:07:11,840 an exercise for the listener. 839 01:07:11,840 --> 01:07:16,400 When the new 3DS was released they rushed it, we think, 840 01:07:16,400 --> 01:07:22,440 because they left some interesting commands in the ps:ps service. And 841 01:07:22,440 --> 01:07:31,150 it included an early version of the NFC crypto used for the Amiibo figurines. 842 01:07:31,150 --> 01:07:36,609 This implementation, the first one, uses a normal key. And the… 843 01:07:36,609 --> 01:07:40,060 the newer one changed it to KeyY. 844 01:07:40,060 --> 01:07:44,290 So they accidentally gave us one of these pairs in the firmware images. 845 01:07:44,290 --> 01:07:47,260 We don’t need to use the Wii U at all. 846 01:07:47,260 --> 01:07:52,210 So anyone who can decrypt 3DS firmware binaries 847 01:07:52,210 --> 01:07:58,400 can perform this attack to get the constants. 848 01:07:58,400 --> 01:08:03,290 So anyone out there: Good luck! 849 01:08:03,290 --> 01:08:06,750 And now: back to smea, for a summary. 850 01:08:06,750 --> 01:08:13,720 *applause* 851 01:08:13,720 --> 01:08:16,880 smea: Right, I’m just gonna conclude really quickly. So, some take-aways of 852 01:08:16,880 --> 01:08:20,839 what we talked about today: first thing is: 853 01:08:20,839 --> 01:08:23,988 it’s all pretty obvious lessons, but – you know – bear with me. 854 01:08:23,988 --> 01:08:29,049 Giving access to physical memory to any application, through GPU or whatever, 855 01:08:29,049 --> 01:08:31,849 is dangerous. You should always be careful about that. Even if you think 856 01:08:31,849 --> 01:08:36,059 you’ve protected stuff, there’s probably gonna be stuff that you forgot. So just, 857 01:08:36,059 --> 01:08:39,538 like “you don’t do it or do it right”.
858 01:08:39,538 --> 01:08:42,408 Other thing is: Shared I/O is dangerous. If you don’t know 859 01:08:42,408 --> 01:08:47,908 what can actually control the I/O, then, well, again, you should be very careful. 860 01:08:47,908 --> 01:08:52,319 Also, only checking your data before decryption is dangerous, 861 01:08:52,319 --> 01:08:56,429 and – both that and not checking the key when you know that it could possibly 862 01:08:56,429 --> 01:09:00,609 be modified by an attacker is a bad idea. And finally, 863 01:09:00,609 --> 01:09:05,099 secrets in hardware are great unless you give them away, so… 864 01:09:05,099 --> 01:09:07,569 don’t do that! *laughs* *audience laughs* 865 01:09:07,569 --> 01:09:11,309 Beyond that we just wanted to talk about the state of Homebrew really quickly. 866 01:09:11,309 --> 01:09:15,488 You might recall, during the Wii U talk around here 867 01:09:15,488 --> 01:09:19,828 2 years ago, fail0verflow said that they didn’t necessarily think 868 01:09:19,828 --> 01:09:23,599 there was much of a future for console Homebrew. And there’s definitely 869 01:09:23,599 --> 01:09:28,629 an argument for that with the rise of phones, mostly. 870 01:09:28,629 --> 01:09:31,908 Anyone can make an app, can make a game for any number of devices 871 01:09:31,908 --> 01:09:37,189 and sell it to millions of people. But you know, we disagree. 872 01:09:37,189 --> 01:09:39,059 *cheers and applause* 873 01:09:39,059 --> 01:09:43,920 It’s been a year since we started releasing 3DS homebrew. And 874 01:09:43,920 --> 01:09:47,788 – this is supposed to be moving, but… let’s imagine it’s moving. 875 01:09:47,788 --> 01:09:52,489 Well, in there there’s, like, a bunch of 3DS Homebrew. It’s been awesome! 876 01:09:52,489 --> 01:09:56,200 We’ve been working on this really hard. A lot of people have been joining us. 877 01:09:56,200 --> 01:10:01,570 It’s a great community effort.
And basically what I want to say is 878 01:10:01,570 --> 01:10:05,860 we want more developers. So if you’d like to join us 879 01:10:05,860 --> 01:10:10,530 there is a very… well it’s not very mature, but it’s maturing, 880 01:10:10,530 --> 01:10:15,130 our SDK. And you know what: reverse-engineering hardware is fun. 881 01:10:15,130 --> 01:10:18,210 When we don’t have any documentation, reverse-engineering software is fun. 882 01:10:18,210 --> 01:10:22,770 We can always use more reverse-engineers and just people who want to make cool shit, 883 01:10:22,770 --> 01:10:28,999 so… Yeah, oh… right! Just one more thing. 884 01:10:28,999 --> 01:10:32,769 Lately there has been a wave of patches by Nintendo, 885 01:10:32,769 --> 01:10:36,170 of known exploits, which has been really annoying. 886 01:10:36,170 --> 01:10:40,479 So for our Browser Hacks, well, yellows8’s Browser Hacks, 887 01:10:40,479 --> 01:10:45,150 menu hacks, stuff like that… Yellows8’s been working pretty hard, 888 01:10:45,150 --> 01:10:49,199 so he actually brought back browser hacks, it should have been released 889 01:10:49,199 --> 01:11:02,720 about 10 minutes ago. *laughter, applause* 890 01:11:02,720 --> 01:11:07,849 But we also had ironhax for an eShop game, a free eShop game, 891 01:11:07,849 --> 01:11:12,479 so you could just download it. That was patched. The thing is, there’s actually 892 01:11:12,479 --> 01:11:16,650 a way to download the old version from the eShop application with some patches. 893 01:11:16,650 --> 01:11:20,269 So we’re also releasing that right now! So basically if you can get Homebrew 894 01:11:20,269 --> 01:11:23,889 and get on to the eShop with a modified patch. 895 01:11:23,889 --> 01:11:27,539 That should also be released in about… well, whenever this is done. 896 01:11:27,539 --> 01:11:31,239 So get it as soon as possible, this is a free game, it will get you 897 01:11:31,239 --> 01:11:36,590 Homebrew forever. So just do that. 
And also, yellows8 just released 898 01:11:36,590 --> 01:11:39,800 a new version of menuhax which works on latest firmware version. 899 01:11:39,800 --> 01:11:43,499 This was also patched like a couple of weeks or months ago. So, this is all out 900 01:11:43,499 --> 01:11:48,099 right now. If you have a 3DS, get it. If you have friends who have 3DS’s, 901 01:11:48,099 --> 01:11:53,749 well, tell them and tell them to get it. Because it might not last super long. 902 01:11:53,749 --> 01:11:57,950 Yeah, so we would like to thank yellows8 who unfortunately can not be here tonight 903 01:11:57,950 --> 01:12:01,800 but has been super helpful, has been doing a ton of work on the 3DS. 904 01:12:01,800 --> 01:12:05,479 And honestly, a ton of this could not have been done without him. 905 01:12:05,479 --> 01:12:08,639 And thanks to everyone on the #3DSDEV Homebrew channel, 906 01:12:08,639 --> 01:12:11,909 everyone who is attending tonight. Thanks for this. 907 01:12:11,909 --> 01:12:14,999 And if you have any questions, I don’t think we have a lot of time, 908 01:12:14,999 --> 01:12:28,429 but we’ll accommodate. Thanks! *applause* 909 01:12:28,429 --> 01:12:31,740 Herald: Thank you for your patience, if you got questions, please come upfront 910 01:12:31,740 --> 01:12:36,469 to these guys, because we have no more time for structured Q&A. Thank you! 911 01:12:36,469 --> 01:12:41,400 *postroll music* 912 01:12:41,400 --> 01:12:47,499 *Subtitles created by c3subtitles.de in the year 2016. Join and help us!*