Volldampf voraus!
24. Chaos Communication Congress
Proceedings
Bibliographic information of the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication
in the Deutsche Nationalbibliografie; detailed bibliographic
data are available on the Internet at http://dnb.d-nb.de/.
24th Chaos Communication Congress. An event of the Chaos Computer Club.
http://events.ccc.de/congress/2007/
Events
Day 1
Day 2
Day 3
Day 4
Papers
27-30 December 2007, Berlin
Science
lecture
Day 2, 12:45
Saal 2
de
Anoushirvan Dehghani
Absurde Mathematik (Absurd Mathematics)
Paradoxes that defy mathematical intuition
A short tour through the abysses of mathematics. Humans are actually equipped with a fairly
well-functioning intuition. Nevertheless, there are paradoxes that are mathematically entirely
correct and provable and yet contradict our intuition. The talk offers a tour through some of
these paradoxes, each explained briefly and vividly.
Not everything that can be proven mathematically can also be grasped intuitively. How, for
example, can a simple solid such as Gabriel's Horn have a finite volume but an infinitely large
surface? Or why, in a truel (a duel with three shooters), does it improve a poor shooter's own
chance of survival to be the last to shoot? Where does Braess's paradox come from, in which
improving one section of a road network can bring the entire traffic flow to a standstill?
How can Penney-Ante turn into an unfair game when a perfectly fair coin is tossed? And what
exactly was the story with the well-known Monty Hall problem (Ziegenproblem): once the first
door with the dud behind it has been opened, should one switch to the other remaining door?
Absurde Mathematik (Absurd Mathematics)
Anoushirvan Dehghani
4 December 2007
Abstract. A short tour through the somewhat more absurd and paradoxical sides of mathematics.
Proofs are shown that contradict human intuition, or simply contradict themselves. Where
possible, the paradoxes are also resolved.
1 Gabriel's Horn
A mathematical paradox known since early modern times is Gabriel's Horn. After its discoverer,
Evangelista Torricelli, it is also called Torricelli's Trumpet.
It is the solid of revolution shown in Fig. 1, generated by rotating the graph of y = 1/x for
all x ≥ 1 about the x-axis.
[Figure 1: Gabriel's Horn, the solid of revolution of y = 1/x]
This solid thus has an infinitely large and yet smooth surface, but only a finite volume!
Put vividly: if one unit of length corresponds to 10 cm, slightly more than three litres of
paint suffice to fill the horn completely. Yet there would never be enough paint to coat its
infinitely large surface, even though the horn is already completely filled with paint!
The explanation of this paradox lies in the different dimensions of surface and volume. The
integration over a solid of revolution can be approximated by adding up, piece by piece, short
two-dimensional ring segments or three-dimensional disc segments, respectively. Their radius
equals the current function value of y = 1/x.
If these segments are made infinitesimally short, they become one-dimensional ring strips and
two-dimensional circles, respectively. If x now grows beyond all bounds, then:
\frac{2\pi}{x}\sqrt{1+\frac{1}{x^4}} \;\approx\; \frac{2\pi}{x} \;\gg\; \frac{\pi}{x^2} \quad \text{for } x \to \infty \qquad (3)
The growing x thus enters only reciprocally linearly into the size ...
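The two integrals behind the finite volume and the infinite surface can be written out as follows. This is the standard textbook calculation, given here as a sketch under the assumptions stated in the text (rotation of y = 1/x for x ≥ 1 about the x-axis); the symbols V and A for volume and surface area are chosen for this illustration.

```latex
\begin{align*}
V &= \pi \int_1^\infty \frac{1}{x^2}\,dx
   = \pi \left[-\frac{1}{x}\right]_1^\infty = \pi, \\
A &= 2\pi \int_1^\infty \frac{1}{x}\sqrt{1+\frac{1}{x^4}}\,dx
   \;\ge\; 2\pi \int_1^\infty \frac{1}{x}\,dx
   = 2\pi \bigl[\ln x\bigr]_1^\infty = \infty .
\end{align*}
```

With one unit of length equal to 10 cm, the volume is π dm³, i.e. about 3.14 litres, which agrees with the "slightly more than three litres" of paint mentioned in the text.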
2 Efron's Intransitive Dice
Common sense says: if the Porsche is usually faster than the Audi, and the Ferrari usually
faster than the Porsche, then the Ferrari will as a rule also beat the Audi. Mathematicians
call this a transitive advantage. That this need not hold in a game of chance with fair dice
seems absurd, and yet it is so!
The first person to present a set of such intransitive dice was Bradley Efron³. The face values
are shown in Fig. 2. Fair means that every face of a die comes up with the same probability
p = 1/6. The strange part: player 1 may pick any one of these four dice. Player 2 can then
always choose one of the remaining dice so that it beats player 1's die on statistical average.
Stated mathematically:
P(A > B) = P(B > C) = P(C > D) = P(D > A) = \frac{2}{3} \qquad (4)
Figure 2: Efron's dice
If the contest is played over, say, ten rounds, A will most likely come out ahead of B.
Likewise B ahead of C. And C ahead of D. And D ahead of A, which brings to mind the image of a
staircase in the style of Escher⁴.
How does this phenomenon come about? Looking at the expected values, i.e. the statistical
means, gives no hint: E[A] = 16/6, E[B] = 3, E[C] = 20/6, E[D] = 3. A look at the conditional
probabilities is more revealing. This direct comparison shows that the gradations of the dice
are chosen precisely so that each die just barely beats its "predecessor" with p = 2/3, using
a minimum of means, that is, of pips on the faces. Put differently: every die is "tuned" so
that it is superior to its inferior counterpart in 24 of 36 cases. The numbers used are chosen
so that the "cycle" described above can form, and thus a superior die exists for every die.
By now there are a number of further sets of intransitive dice. The blemish of die B, which
quickly becomes boring to roll, has been eliminated. An intransitive set can also be built
from just three dice. The bottom line: the property of being the probable winner of a match
need not be transitive! What was laid down arbitrarily in "rock, paper, scissors" can also be
grounded in a solid set of rules.
3 Born May 1938 in Minnesota, USA.
4 After Maurits Cornelis Escher, born 17 June 1898 in Leeuwarden, NL; died 27 March 1972 in
Hilversum, NL.
3 Penney-Ante
While we are on the subject of intransitive paradoxes: how about a simple coin toss? Let the
probability p of tails (Zahl, Z) be equal to q, the probability of heads (Kopf, K):
p = q = 1/2. The coin is to be both fair and memoryless, so the outcome of a toss is not
influenced by the preceding tosses.
The rules of the game are: player 1 picks an arbitrary sequence of coin tosses of length at
least three, for example ZKK or KKZK. Player 2 then also picks a sequence. The coin is then
tossed until the sequence of one of the two players appears. If player 2 does everything
right, he will always find a combination whose probability of winning is higher than that of
player 1. For the examples given, these would be ZZK and ZKKZ. How can that be? After all, the
probabilities are pqq = ppq = 1/8 and qqpq = pqqp = 1/16, respectively. Or are they?
The tactic with which Walter Penney [6] tilts the probable outcome of this game in his favour
is as follows: if player 1 has chosen the sequence of length n
m_1 m_2 m_3 \ldots m_n, \qquad (5)
then player 2 bets on the sequence
\overline{m_2}\, m_1 m_2 \ldots m_{n-1}. \qquad (6)
The crucial element is \overline{m_2}, the opposite of m_2: K instead of Z and Z instead of K.
For his last n − 1 positions, player 2 thus chooses exactly the values that player 1 has in
his first n − 1 positions. Player 2's first value is the negation of player 1's second value,
as in the examples above.
A state diagram as in Fig. 3 helps to understand this. Here player 1 bets on ZKK and player 2
on ZZK. Each transition corresponds to the outcome of one coin toss, K or Z. We begin in the
leftmost state, "Start". As soon as a Z lands for the first time, both sequences (each of which
begins with Z) are initialised and state A is reached. Depending on how the tosses continue,
sooner or later the winning state of player 1 or of player 2 is reached.
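Both intransitivity claims of Sections 2 and 3 can be checked with a few lines of Python. The following is a sketch, not code from the talk: the face values in EFRON are an assumption (the commonly cited Efron dice, which reproduce the expected values 16/6, 3, 20/6 and 3 quoted in Section 2), and the names win_probability, penney_reply and simulate are made up for this illustration. Z and K denote tails and heads as in the text.

```python
import random
from fractions import Fraction
from itertools import product

# Assumed face values; Figure 2 is not reproduced in this text.
EFRON = {
    "A": (4, 4, 4, 4, 0, 0),
    "B": (3, 3, 3, 3, 3, 3),
    "C": (6, 6, 2, 2, 2, 2),
    "D": (5, 5, 5, 1, 1, 1),
}

def win_probability(x, y):
    """Exact probability that die x shows more than die y (36 equally likely cases)."""
    wins = sum(1 for a, b in product(x, y) if a > b)
    return Fraction(wins, len(x) * len(y))

def penney_reply(first):
    """Player 2's sequence by rule (6): negated second value, then the first n-1 values."""
    flipped = "K" if first[1] == "Z" else "Z"
    return flipped + first[:-1]

def simulate(first, second, games=20000, seed=1):
    """Estimate by simulation how often `second` appears before `first`."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(games):
        tosses = ""
        while True:
            tosses += rng.choice("ZK")      # fair, memoryless coin
            if tosses.endswith(second):
                wins += 1
                break
            if tosses.endswith(first):
                break
    return wins / games

if __name__ == "__main__":
    for x, y in ("AB", "BC", "CD", "DA"):
        print(x, "beats", y, "with probability", win_probability(EFRON[x], EFRON[y]))
    for first in ("ZKK", "KKZK"):
        reply = penney_reply(first)
        print(first, "vs", reply, "-> player 2 wins about", simulate(first, reply))
```

The dice part is exact, the coin part is only a simulation estimate; an exact treatment would solve the Markov chain that the state diagram of Fig. 3 describes.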
4 The Monty Hall Problem (Ziegenproblem)
Hacking
lecture
Day 1, 17:15
Saal 2
en
Victor Muñoz
In this lecture we will see how easily AES keys can be retrieved by attacking implementations.
When you have physical access to the box that tries to hide a key, you can easily spot it. This
kind of security amounts to mere obfuscation, yet it is widely used in DRM technologies such as
AACS. This is a demonstration that using a strong algorithm like AES makes little sense when the
key is handed to the attacker in a merely obfuscated form; remember that a security chain is
only as strong as its weakest link.
October 2007
the AACS scheme, all those attacks have something in common: AES key spotting, with little
effort in comparison with the state-of-the-art side-channel attacks on AES.
References
[2] http://www.iaik.tugraz.at/aboutus/people/oswald/papers/aes_report.pdf - AES - The State of the Art of Rijndael's Security
[4] http://forum.doom9.org/showthread.php?t=119871
Science
lecture
2007-12-29 16:00
Saal 2
en
Tomasz Rybak
The presentation shows simple statistics of the Sputnik data. The main part describes ways of
reconstructing the sequences of packets generated by tags. Two methods, local and global, are
described, with a few variants. Problems with using those methods are presented.
This article describes attempts to analyse data from the Sputnik project gathered during the 23rd
Chaos Communication Congress. The most significant problem was recovering lost sequence identifiers,
and this is the main subject of the article.
1 Sputnik idea
Sputnik is an RFID system intended to trace people in small areas and buildings. Each person
wears a tag that transmits its identifier at regular intervals, which allows the person's position
at those moments to be stored. The system was used during the previous, 23rd Congress, and during
the Chaos Communication Camp 2007. Data from the Camp has not yet been released, so this article
describes analysis performed on data from 23C3.
After the data was released, a few web pages were created that describe the system and the data
and try to analyse it. The main page of the project¹ is maintained by the creators of the Sputnik
system. The OpenBeacon wiki contains a page² with a discussion of the released data. Peter Meerwald
came up with a page³ presenting some analysis of the gathered data. Kaner's page⁴ contains a parser
for the log files that extracts information about one particular ID only. My page⁵ contains the
programs and results described in this article.
2 Hardware
Ordinary RFID systems operate in the range of a few dozen kHz and use passive tags. Such a tag
does not contain any power source; it is powered by the reader during the reading process, so
without a reader it can do nothing. Sputnik uses active tags: they have their own battery and
transmit data whether or not a reader is listening. Having its own battery allows a tag to transmit
at high power and thus over a long range. The range in buildings is up to 10 m, even through
drywall. Concrete walls tend to block the signal. Because transmission occurs at 2.4 GHz, the human
body decreases the signal power by about 50%.
Thanks to its own battery, a tag has control over its transmission power and can send signals of
varying strength. This allows the distance from a reader to be estimated. During 23C3, 25 readers
were placed in the BCC in such a way that in most cases more than one reader saw a tag. Combined
with the distance estimates, this allows the position of a tag to be estimated.
The first readers were large boxes using Power over Ethernet to communicate with the server and
to power themselves. During the Camp, Milosz Meriac presented a USB reader⁶, a small device that is
powered and transmits data over USB. It acts like a terminal, sending data in text format; the
computer can receive the packets read and send commands to it. Additionally it can serve as a tag,
as it has a full transmitter on board. Because it is more sophisticated than a tag, the user has
more control over the RFID packets sent. It creates a /dev/ttyACM* device and sends text in either
“ID,Sequence,strength,flags” or “RX: ID,strength,number” format, depending on the firmware version.
It can be reprogrammed directly over USB, without any additional hardware.
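As an illustration of the two text formats, a line reader could look roughly like the sketch below. It is not part of the OpenBeacon software: the names Reading and parse_reader_line are hypothetical, and the assumption that every field is a decimal integer is mine, since the text gives only the field order.

```python
from typing import NamedTuple, Optional

class Reading(NamedTuple):
    tag_id: int
    strength: int
    sequence: Optional[int]   # only in the "ID,Sequence,strength,flags" format
    flags: Optional[int]      # only in the "ID,Sequence,strength,flags" format
    number: Optional[int]     # only in the "RX: ID,strength,number" format

def parse_reader_line(line: str) -> Optional[Reading]:
    """Parse one text line from the USB reader; return None if it does not fit.

    Assumption: all fields are decimal integers.
    """
    text = line.strip()
    try:
        if text.startswith("RX:"):
            tag_id, strength, number = (int(f) for f in text[3:].split(","))
            return Reading(tag_id, strength, None, None, number)
        tag_id, sequence, strength, flags = (int(f) for f in text.split(","))
        return Reading(tag_id, strength, sequence, flags, None)
    except ValueError:
        # wrong number of fields or a non-numeric field
        return None
```

Lines read from the /dev/ttyACM* device could be passed through this function one by one; keeping both formats in one record type means later code does not need to know the firmware version.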
1 http://www.openbeacon.org/
2 http://wiki.openbeacon.org/wiki/Datamining
3 http://pmeerw.net/23C3 Sputnik/
4 http://cakelab.org/ kaner/sputnik 01/
5 http://www.bogomips.w.tkb.pl/sputnik.html
6 http://wiki.openbeacon.org/wiki/OpenBeacon USB
3 Data format
Data gathered during 23C3 was made available as both XML and binary files.
XML file
The XML file consists of “observation” tags with the following attributes:
id ID of tag
time
position position of tag; (0, 0, 0) if unknown
direction always (0, 0, 0)
priority always the same value 24
The XML file contains a very small portion of the data that was gathered during 23C3. It has only
357974 entries, whereas the full data set has 11.1 million observations. It does not contain details
of the readers used to calculate the positions of tags. This omission is important, as about one
third of the observations have no meaningful position, probably because in those cases there was not
enough data to calculate one. The XML file also contains data from only a few hours of each day of
the Congress; probably those are the hours when the server was active. The number of observations
stored in the XML file over the course of the Congress is shown in Figure 3.
Because the XML data contains neither sequence numbers nor the reading stations used to calculate
positions, I did not use it in the analysis.
Data from the binary file was more useful for analysis, although it contained errors. Because of an
error in the server software, identifiers of tags were not saved.
0-4 timestamp
5-8 reader station IP
12 strength of signal
12-16 sequence number
17-20 Tag ID
21-24 check sum
0-4 timestamp
5-8 reader station IP
9-12 garbage (used by me to write ID)
Missing identifiers made analysis almost impossible. An additional problem was 8 surplus bytes in
one of the files; information published on the OpenBeacon mailing list allowed me to remove those
unnecessary bytes and to obtain the full data set. The binary data set had 64K repeated readings,
that is, observations identical to other observations.
4 Database
A data set this large takes a long time to read and parse, so I decided to store it in a PostgreSQL
database. In the beginning both the XML and the binary set were stored in one table, which was later
divided into two tables; then more support tables were added. PostgreSQL table inheritance was used
to ease operating on the main data tables⁷.
The created database can be seen as temporal and, when looking at the XML data, also as spatial.
Such databases store information about the presence of phenomena in space and time. This database
stores information about the presence of tags (and probably of the persons wearing them) at a given
place at a given moment. Activities performed on the tags, like pressing the button, are also stored.
Additional spatial data, like the geometry of the building and of the rooms where events were held,
and temporal data (the schedule of the Congress) can be used for more sophisticated analysis.
Created tables
id
time
sequence value of sequence counter
observedobject
priority
mindistance
maxdistance
Table of rooms
Describes the rooms in which events (lectures) took place.
id identifier of room
name name of room: “Saal 1”, “Shelter foo”, . . .
shape path describing room shape. Currently empty column; data to fill it could be taken from GPS
data or from building plans
ymin
ymax
bbox Is it necessary, or better use geometry calculations or PostGIS?
Event table
Describes events. Populated using the XML schedules published on
http://www.ccc.de/
id identifier of event
organizerid
name name of event
place identifier of the room in which the event takes place
description human-readable description
address URL of description of event
5 Analysis of data
To understand the further operations, one needs to understand how tags work internally. In each
transmission a tag sends its ID and the signal strength it uses to transmit. Each transmission is
encrypted using XXTEA. To avoid replay attacks, packets must change from one transmission to the
next. Because adding a real-time clock would be too complicated, an ever-increasing counter was
added. The base station discards all packets with counter values lower than the one seen previously.
To avoid problems when a tag is reset (for example by removing the battery) and the counter would
start again from 0, the counter was divided into two words. The higher word is saved across resets,
the lower one is not. After a reset the tag increments the higher word, so the counter value always
grows. This means that gaps occur in the sequence of counter values whenever a tag is reset. To
avoid collisions, each tag sleeps for a random time of 2 to 4 seconds between transmissions.
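The counter behaviour described above can be modelled in a few lines. This is a toy model and not the tag firmware: the class name TagCounter is hypothetical, and the split into two 16-bit words is an assumption based on the factor 65536 used in the listings of this article.

```python
class TagCounter:
    """Toy model of a tag's packet counter, split into a high and a low word."""

    WORD = 65536  # assumed size of the low word (16 bits)

    def __init__(self, high=0):
        self.high = high  # survives a reset (kept in non-volatile memory)
        self.low = 0      # lost on reset

    def value(self):
        return self.high * self.WORD + self.low

    def tick(self):
        """Advance the counter for the next transmission and return its value."""
        self.low += 1
        if self.low == self.WORD:   # low word overflows into the high word
            self.low = 0
            self.high += 1
        return self.value()

    def reset(self):
        """Battery removed and reinserted: low word lost, high word incremented."""
        self.high += 1
        self.low = 0
        return self.value()
```

After three ticks and a reset the values are 1, 2, 3, 65536: the counter still grows, but with a gap, which is exactly what later makes a reset look like a very long jump in the sequence.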
Figure 3: Number of packets read during one minute including unknown points
Id IP address count
2 10.254.2.3 1322696
21 10.254.5.21 880833
3 10.254.2.12 760606
15 10.254.1.6 758782
18 10.254.5.2 596466
14 10.254.4.12 589640
20 10.254.8.14 585443
26 10.254.1.16 570525
5 10.254.1.7 568765
4 10.254.2.10 563488
1 10.254.4.6 542657
16 10.254.1.12 532699
22 10.254.4.11 528187
11 10.254.1.22 494524
10 10.254.1.5 448760
9 10.254.2.5 428565
8 10.254.3.9 376396
24 10.254.3.5 231483
23 10.254.7.14 225075
17 10.254.0.254 187078
6 10.254.3.13 130379
13 10.254.0.7 129144
12 10.254.3.21 54863
25 10.254.0.100 8524
Strength of packets
Strength count
0 182874
85 568413
170 1167287
255 9225658
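# Collect the packets that may directly follow the packet (old_e, old_s).
# Assumption: each row of data holds (time, counter value); old_e and old_s
# are the time and counter value of the packet examined before.
old_major, old_minor = divmod(old_s, 65536)
p = []
for j in data:
    e, s = int(j[0]), int(j[1])
    major, minor = divmod(s, 65536)
    # next tick within the same high word, or first value after a reset
    probable = ((major == old_major and minor == old_minor + 1)
                or (major == old_major + 1 and minor == 0))
    if probable:
        p.append([e, s])
if len(p) > 0:
    print(old_e, old_s, end=" ")
    for j in p:
        print(j[0], j[1], end=" ")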
The basic idea of the algorithm for finding local sequences is an enhancement of the code above. It
takes all points from a chosen period of a few dozen seconds. To find all sequences of ticks in that
period, it assumes that ticks are about 1.5 s apart. Starting from the lowest counter value, it
tries to find the next value. For counter values that are very close, the time difference is 1 or 2
seconds. Over longer distances, the time difference should approach 1.5 s per tick. The algorithm
ignores the signal strength and the stations that received the packet.
When more than one packet can be chosen to extend a sequence, a conflict occurs that must be
resolved. A conflict arises either because two different counter values occur at the same time, or
because the same value occurs at different moments. In either case only one packet may be included
in the sequence, and the others must be discarded. Note that a conflict may involve not just two but
several packets. The general case is the presence of more than one sub-sequence that can extend the
existing sequence. Only one of them may be chosen, as adding all sub-sequences would destroy the
existing sequence by introducing decreases in either time or counter values.
A sub-sequence may be chosen by its length or by its resemblance to the already existing sequence.
Using a separate function for choosing the sub-sequence to add makes it possible to experiment with
different criteria and to introduce more sophisticated ones.
An alternative solution is a function that returns the next values of time and counter, based on
the sequence being rebuilt. This is more complicated, as it requires knowing the exact parameters of
the tag, especially the time when it was started or reset and the exact time it sleeps between
transmissions.
Function GetTickDistance returns the difference between counter values. It tries to take resets
into account by treating a reset as a difference of 1. It decides that a reset occurred when the
values passed as arguments have differing high words. However, if there is less than about one
minute left until the high word changes, it does not assume that a reset was involved.
Distance between sequence values
# Assumes a <= b
# Will not work when there is more than 1 overflow
def GetTickDistance(a, b):
    majora, minora = divmod(a, 65536)   # high and low word of a
    majorb, minorb = divmod(b, 65536)   # high and low word of b
    # Same high word, or less than a minute to overflow of the low word
    if majora >= majorb or minora >= 65500:
        return b - a
    # High word changed: treated as a reset
    return majorb - majora + minorb + 1
To recreate sequences, it is necessary to create all alternatives and then choose the best ones.
Hashes are used to store all counter values that were received at any given moment, and all moments
at which any given counter value was received. The keys of the hashes are read in increasing order,
and all values stored under each key are considered as extensions of sequences. If the considered
point can be added to the sequence, it is added. If not, a conflict is detected: the previous value
is removed from the sequence, and both points are added to a special list of alternatives. From then
on, each subsequent point is treated as an extension not of the main sequence but of the alternative
sub-sequences. If it can be added to all of them, the alternatives are stored and the point is added
to the main sequence. If it can be added to only some of the sub-sequences, the conflict remains. If
it cannot be added to any of the sub-sequences, it is added as another alternative sub-sequence.
Function FindBestSequence takes the sequence and all alternative sub-sequences calculated by the
previous function and builds the optimal sequence. It chooses the best possible sub-sequences to
add. To choose them it uses the slope of the sub-sequences and picks the one whose slope is closest
to 1.5; the minimal squared difference is used to find the slope closest to this ideal.
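The selection criterion can be sketched as follows. This is one possible reading of the description, not the author's FindBestSequence: the names slope and choose_best are hypothetical, each alternative is assumed to be a list of (time, counter) points, and the slope is taken between the first and last point of an alternative.

```python
IDEAL_SLOPE = 1.5  # seconds per counter tick, as stated in the text

def slope(points):
    """Seconds per tick between the first and the last (time, counter) point."""
    (t0, s0), (t1, s1) = points[0], points[-1]
    return (t1 - t0) / (s1 - s0)

def choose_best(alternatives):
    """Return the alternative whose slope has the smallest squared difference
    from IDEAL_SLOPE, or None if no alternative spans at least one tick."""
    usable = [p for p in alternatives if len(p) > 1 and p[-1][1] != p[0][1]]
    return min(usable, key=lambda p: (slope(p) - IDEAL_SLOPE) ** 2, default=None)
```

For example, among alternatives with slopes 3.0, 1.5 and 1.0 the second one is returned. Keeping the criterion in its own function makes it easy to swap in another one, such as the length of the sub-sequence.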
Finding best sequences amongst all created
The described algorithm can be implemented in two ways. The main loop may iterate over time and
check all possible counter values, or it can iterate over counter values and check all moments at
which each value appears. The two approaches should be equivalent, but iterating over counter values
yields more and longer sequences. If using more CPU time is not a problem, both variants can be run
and the best result of either chosen, independently for each considered interval.
The first code that was used to process a large part of the data was an implementation of an O(N³)
algorithm. For each point it determined whether any of the other points could be added to the
sequence by checking whether the condition Δt = aΔs, 1.0 ≤ a ≤ 2.0, was met (Δt in seconds, Δs in
counter ticks). After finding all candidate points it generated all possible alternatives from this
chosen set. As it checked all other points for every point of the given interval, this operation was
O(N²). Whenever a sequence was found, it was removed from the data set and the entire process was
started from the beginning, hence the O(N³) time cost.
O(N³) algorithm
            # ...
            dt = minort - majort
            ds = GetTickDistance(majors, minors)
            if dt > 0 and ds <= dt and dt <= 2*ds:
                p.append([minort, minors])
        if len(p) > 1:
            again = True
            r = CreateAllSequencesSeqs(p)
            s = FindBestSequence(r)
            a += 1
            if len(s) > b: b = len(s)
            break
    if again:
        for i in s:
            # SQL issued for every point i of the sequence found:
            #   UPDATE sputnik.sputnik SET id = %s
            #   WHERE sequence = %s AND time = to_timestamp(%s)
            for j in data:
                if i[0] == j[1] and i[1] == j[2]:
                    data.remove(j)
                    break
        id += 1
The speed of this algorithm was improved following the observation that the longest sequences are
built when starting from the lowest time and lowest counter values. The query was changed to return
a sorted result. The algorithm was changed to take the first tuple and try to find all other tuples
that can form a sequence with it. If a sequence was found, it was removed from the data set; if not,
only the first tuple was removed. So for each tuple all other tuples were considered, which gives
O(N²). Because the process is not restarted when a sequence is found and the remaining tuples are
simply processed further, the cost stays at O(N²).
This algorithm gives the same results as the previous one; this was checked by comparing the
sequences generated by both for a few intervals. The cost of these algorithms can be slightly higher
than O(N³) and O(N²) when the building and comparing of alternative sub-sequences is taken into
account. However, such sub-sequences are small compared to the main sequences. Their size also tends
to remain constant even when the length of the analysed interval, and with it the size of the
generated sequences, increases.
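The greedy idea can be sketched in a simplified form. This is not the author's code: can_follow and build_sequences are hypothetical names, points are assumed to be (time, counter) tuples, each candidate is compared with the last accepted point rather than with the first tuple, and the handling of conflicts and alternative sub-sequences is left out.

```python
def can_follow(prev, cand, lo=1.0, hi=2.0):
    """True if cand may follow prev: the counter grows and the time per tick
    lies between lo and hi seconds (the condition used in the listing above)."""
    dt = cand[0] - prev[0]
    ds = cand[1] - prev[1]
    return ds > 0 and lo * ds <= dt <= hi * ds

def build_sequences(points):
    """Greedy O(N^2) grouping of (time, counter) points into sequences."""
    remaining = sorted(points)          # lowest time (then counter) first
    sequences = []
    while remaining:
        seq = [remaining.pop(0)]        # always start from the first tuple
        rest = []
        for p in remaining:
            if can_follow(seq[-1], p):
                seq.append(p)
            else:
                rest.append(p)          # left for a later sequence
        remaining = rest
        sequences.append(seq)
    return sequences
```

Each pass over the remaining points is linear, and there are at most N passes, which gives the quadratic cost. Points that fit nowhere end up as sequences of length one and can be filtered out by the caller.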
O(N²) algorithm
Function JoinIDs computes all sequences for one interval and for the interval after it, and then
tries to join the sequences found. For each sequence in the main interval it calculates the
coefficient (slope) of the line through its last point and the first point of a sequence from the
next interval. If a line with a coefficient between 1.0 and 2.0 is found, those sequences are
candidates for joining. However, they would also need to have the same coefficients themselves
before they could be joined.
Function trying to join found sequences
I think it might even be possible to improve the local algorithm to O(N) time cost. However, this
was not implemented, so I do not know whether it is really possible or whether it would give good
results.
The function calculating the distance between counter values was changed, as it was producing
strange sequences (65600, 132000, 512000, . . . ). Resets were now ignored, and the distance became
the ordinary difference of counter values. However, this did not help. The local algorithms were not
able to find sequences that were long enough. A few of the sequences found were rather long (up to
20 packets per minute), but most consisted of only 2 or 3 packets. This led to large gaps between
sequences from consecutive intervals and to trouble with joining them.
New distance in sequence counter function
# Assumes a <= b
# Resets are ignored: the distance is the plain difference of counter values
def GetTickDistance(a, b):
    return b - a
Scatter plots drawn for long intervals reveal straight lines. This led to the idea of finding
straight lines (as drawn in geometry) and treating them as sequences. To avoid problems with resets,
calculations were done inside 64k blocks of counter values.
The best way to find the longest sequences is to start with the point with the lowest values of
counter and time, and then to draw lines through it and every other point in the range. Choosing the
slope whose line passes through the most points gives the longest sequence. This is a greedy
algorithm, as in each step the largest sequence is chosen.
To choose the best line coefficient, a histogram of all slopes is used, with buckets of size 0.1.
To make sure that no point is left out because of rounding errors, a range of slopes is used: all
points that lie on lines whose slopes differ by less than ±0.3 from the chosen slope are included in
the created sequence.
Because for each point all other points are used to calculate slopes, and then all points in the
right coefficient range are chosen, the time cost is O(N²).
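One step of this global search can be sketched in plain Python. The author's version works through SQL queries, so the following is only an illustration under assumptions: find_line is a hypothetical name, points are (time, counter) tuples from one 64k block, and the slope is measured in seconds per tick from the starting point.

```python
from collections import Counter

def find_line(points, bucket=0.1, tolerance=0.3):
    """Return the points assigned to the best line through the lowest point.

    bucket is the histogram bucket width, tolerance the allowed deviation
    from the chosen slope (0.1 and 0.3 in the text).
    """
    start = min(points, key=lambda p: (p[1], p[0]))   # lowest counter, then time
    slopes = {}
    for p in points:
        ds = p[1] - start[1]
        if ds > 0:
            slopes[p] = (p[0] - start[0]) / ds        # seconds per tick
    if not slopes:
        return [start]
    # Histogram of slopes; the fullest bucket gives the chosen slope.
    histogram = Counter(round(s / bucket) for s in slopes.values())
    best = histogram.most_common(1)[0][0] * bucket
    members = [p for p, s in slopes.items() if abs(s - best) < tolerance]
    return [start] + sorted(members, key=lambda p: p[1])
```

Calling this repeatedly on the points not yet assigned gives the greedy procedure. The two parameters are exactly the knobs discussed later: a wide tolerance pulls in points from neighbouring lines, and a coarse bucket lets several short lines outvote one long line.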
The algorithm finds long sequences. It leaves only about 4000 points (out of 11.1 million) without
any sequence. However, rather strange line coefficients are found: besides the ordinary 2.4 and 2.5,
it comes up with 0.1, 0.4, 0.5, 9.9, 10.0, 8.1, . . .
Function FindIDs takes a range of counter values and tries to find all sequences in this range. It
finds all counter values and, for each value, all times at which it occurs; this is similar to the
hashes used in the local algorithms. Then, for each starting point, a histogram of all line
coefficients is created and the largest value is used. A query similar to the one calculating slopes
is used to mark all points as belonging to one sequence. The update is done by a single SQL query.
Finding sequences in global manner
The following code shows how the function for creating sequences is called. First the lowest unused
value for a sequence identifier is found, and then function FindIDs is called for each value of the
high word of the tag counter. Originally the range was divided into time intervals so that the
program would operate on smaller data sets, but because of an error in the code the time interval
was not respected and the first call calculated all sequences from the entire range.
Calling a sequence finder
Figures 4 to 9 show sequences generated by this algorithm. Some sequences are proper ones, but
others are wrong; their points really belong to many different sequences.
Figures 5 and 6 show sequences that look like a collage of many sequences right from the beginning.
They show the main problem of the algorithm: the range of allowed coefficients is too wide, and too
many points are added to a sequence. The farther from the first point, the more obvious this becomes.
Figure 7 shows a sequence that is correct in the beginning and goes wrong only at the end. So the
first part should be preserved, and the sequence should end somewhere in the gap after it.
Figure 8 shows a sequence that is generated by all variants of the global algorithm.
The sequence shown in Figure 9 exhibits errors that came from integer overflow. Because initially I
did not use Python long integers, counter values close to 4 billion were treated as small negative
values and joined with genuinely small values. The column storing counter values used 64-bit
integers, so PostgreSQL was able to update rows with large counter values without destroying other
sequences.
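The effect can be reproduced with a small sketch. The helper names as_signed32 and as_unsigned32 are hypothetical, and the 32-bit width is inferred from the "close to 4 billion" remark; the point is only to show how a large counter value turns into a small negative one when read as a signed 32-bit integer.

```python
def as_signed32(value):
    """Interpret the low 32 bits of value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value

def as_unsigned32(value):
    """Map a (possibly negative) 32-bit value back to the range 0 .. 2**32 - 1."""
    return value & 0xFFFFFFFF
```

A counter value of 4294967290 read as a signed 32-bit integer becomes -6 and so sorts next to genuinely small values; as_unsigned32(-6) restores 4294967290.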
Figure 10 shows packets that were not used in any sequence. There were only about 4000 such points,
which is a very good result for a data set consisting of 11.1 million points.
Figure 11 shows the size of the generated sequences calculated as the number of occurrences of a
(time, counter value) pair; even if a packet was seen by more than one reader, it was counted only
once. In other words, it shows the number of occurrences of a tag, not how many times it was seen.
Figure 12 shows the size of sequences calculated as the number of tuples that are included in each
sequence.
Figure 11: Histogram of sizes of generated sequences for the first set
Figure 12: Histogram of sizes of generated sequences for the first set
Program was running for about 72h on AMD Duron 1.3GHz with 768MB RAM and single HDD IDE
7200RPM. It was IO-constraint, probably because of database size larger than available RAM; CPU
was not much used. Clustering data table according to counter values could improve performance in
the beginning. However PostgreSQL does not try to preserve clustering, so after adding many points to
sequences clustering would be lost and Input/Output capacity would again become limiting factor. Also
PostgreSQL decides to scan entire table if there is more than 5% rows in result, so in this algorithm
entire data table is read.
The main problem with the algorithm is sequences that contain points which should belong to several
different sequences. This is caused by too wide a range of allowed coefficient values. The farther
from the initial point, the more visible the problem becomes.
Figure 13 shows a histogram of line coefficients with a bucket size of 0.1. Figure 14 shows a histogram
of line coefficients with a bucket size of 0.001. As can be seen, the first histogram gives a false
picture: many lines that each consist of a small number of points but have close coefficient values
can together outnumber one line with a high number of points. In this situation a short line is chosen
instead of the long one, and all the neighbours that helped outnumber the long line are joined into this
improper sequence.
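The effect of the bucket size can be illustrated with made-up numbers. The sketch below is not the
histogram code of the program; the data (one line of 50 points, eight lines of 8 points each) and the
name fullest_bucket are assumptions chosen only to show the mechanism.

```python
from collections import Counter


def fullest_bucket(slopes_milli, width_milli):
    """Return (bucket start, point count) of the fullest histogram bucket.

    Slopes are given in thousandths to avoid floating-point bucket edges;
    ties go to the smallest slope.
    """
    counts = Counter(s // width_milli for s in slopes_milli)
    top = max(counts.values())
    bucket = min(b for b, n in counts.items() if n == top)
    return bucket * width_milli / 1000.0, top


# Made-up data: one long line and eight short lines with close slopes.
long_line = [3200] * 50
short_lines = [2010 + 10 * i for i in range(8) for _ in range(8)]
slopes = long_line + short_lines

print(fullest_bucket(slopes, 100))  # bucket width 0.1
print(fullest_bucket(slopes, 1))    # bucket width 0.001
```

With buckets of 0.1 the eight short lines share one bucket and win with 64 points; with buckets of
0.001 each of them falls into its own bucket and the long line with 50 points is chosen.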
Improvements to the algorithm were necessary to get better results. The first was refactoring the code;
most activities were moved into functions. The second improvement was an SQL aggregate function that
chooses only one counter value at any given time. This function was used together with grouping by
time, and the chosen point was the one closest to the chosen slope. To avoid problems with many
lines joining into one, the width of histogram buckets was changed to 0.001. The histogram was
calculated for slopes in the range 1.0 to 5.0. Additionally, the range of allowed coefficients was
changed from ±0.3 to ±0.001. However, this caused a gap at the beginning of each sequence: because of
rounding errors, in the first few minutes the slope was not close enough to the ideal to fall within the
chosen range of slopes.
The function sputnik.guessbest is an SQL aggregate used to choose one point when more than one counter
value is present at the same time. It requires grouping by time in the SQL query. It chooses the point
whose distance from the chosen slope is the smallest. To calculate the distance from this line it needs
to know the parameters of the line, so sputnik.guessinit must be called before every query that uses
sputnik.guessbest. Both functions are written in PL/Python and use the global dictionary (GD) that
PostgreSQL provides to Python functions to store the line parameters and the best point found.
Currently PostgreSQL in Debian does not offer trusted PL/Python, so untrusted PL/PythonU is used.
Creating functions in untrusted languages requires administrative access to the database (usually the
user “postgres”) and SECURITY DEFINER at creation time to allow ordinary users to call them.
Grouping function
CREATE OR REPLACE FUNCTION sputnik.guessinit(t TIMESTAMP WITH TIME ZONE, sequence BIGINT, slope DOUBLE PRECISION)
RETURNS VOID
VOLATILE RETURNS NULL ON NULL INPUT SECURITY DEFINER
LANGUAGE 'plpythonu' AS
$$
# Remember the line parameters for later calls of sputnik.guessbest
GD["time"] = t
GD["sequence"] = sequence
GD["slope"] = slope
$$;
CREATE OR REPLACE FUNCTION sputnik.guessbest(state BIGINT, t TIMESTAMP WITH TIME ZONE, sequence BIGINT)
RETURNS BIGINT
VOLATILE CALLED ON NULL INPUT SECURITY DEFINER
LANGUAGE 'plpythonu' AS
$$
# Without line parameters or input there is nothing to choose from
if (not GD.has_key("time")) or (not GD.has_key("sequence")) or (not GD.has_key("slope")):
    return None
if (t is None) or (sequence is None):
    return None
# Slope of the line from the stored starting point to a candidate point
plan = plpy.prepare("""
    SELECT (extract('epoch' FROM ($1::TIMESTAMP WITH TIME ZONE-$2::TIMESTAMP WITH TIME ZONE)))::float/($3::BIGINT-$4::BIGINT)::float AS slope
""", ["timestamptz", "timestamptz", "int8", "int8"])
result = sequence
if state is not None:
    r0 = plpy.execute(plan, [t, GD["time"], sequence, GD["sequence"]], 1)
    r1 = plpy.execute(plan, [t, GD["time"], state, GD["sequence"]], 1)
    # Keep the counter value whose slope is closer to the chosen slope
    if abs(r0[0]["slope"]-GD["slope"]) <= abs(r1[0]["slope"]-GD["slope"]):
        result = sequence
    else:
        result = state
return result
$$;
The function Histogram calculates a histogram of the slopes of all lines going through a given point.
If more than one slope has the same maximal number of points, the smallest one is chosen. The function
returns the slope and the number of points in its bucket. If it is unable to calculate any slope it
returns the pair 0, 0.
Histogram function
def Histogram(c, time, sequence, sa, sz):
    hash = {}
    c.execute("""SELECT DISTINCT ON (time, sequence) time, sequence,
        (extract('epoch' FROM (time-%s::TIMESTAMP WITH TIME ZONE)))::float/(sequence-%s::BIGINT)::float
        FROM sputnik.sputnik WHERE id IS NULL AND
        sequence BETWEEN %s::BIGINT AND %s::BIGINT AND
        time > %s::TIMESTAMP WITH TIME ZONE AND
        sequence > %s::BIGINT""", (time, sequence, sa, sz, time, sequence))
    i = c.fetchone()
    while i is not None:
        # Bucket index: slope in thousandths
        k = int(i[2]*1000)
        if 1000 <= k and k <= 5000:
            hash[k] = hash.get(k, 0)+1
        i = c.fetchone()
    if len(hash) > 0:
        m = max(hash.values())
        for i in xrange(1000, 5001):
            # Let's take the smallest max
            if m == hash.get(i, 0):
                result = float(i)/1000.0
                break
        return result, m
    else:
        return 0.0, 0
The function Line takes as parameters the starting point of a line, the slope of the line and the
allowed range of slopes, and finds all points that lie on that line. It initialises the global Python
dictionary, as the main query uses the aggregate sputnik.guessbest. It retrieves all matching points
from the database and returns a list holding them.
Function finding points on line with given slope
The function FindIDs iterates through all counter values inside a given range and finds all times at
which any counter had a particular value. Each such pair is treated as a potential starting point of a
line; the histogram of slopes is calculated, and if the returned bucket holds more than 8 points, a new
sequence is created. Unlike the previous version, this function does not use one update query; every
point is updated by a separate SQL command.
Function finding all lines
Figure 20: Histogram of sizes of generated sequences for the second set
Figure 21: Histogram of sizes of generated sequences for the second set
Figure 23 shows three distinct sequences joined into one. They have similar slopes and their points lie
in the allowed range, so they are joined together even though those points should form distinct
sequences.
Figure 24 shows three sequences that have different slopes but are also joined. This situation can
be detected by calculating the difference of slopes between consecutive points, similarly to
differentiation. A long run of differences of one sign followed by a long run of differences of the
opposite sign suggests that different sequences were joined.
Figure 25 shows a sequence whose points do not lie directly on the ideal line. It may seem similar
to the previous situation, but (especially if the differences between points and slopes are not large)
it is a single sequence. The main difference between the situations in Figures 24 and 25 is the number
of points that have the same sign of slope difference, and the absolute difference between those
slopes. If both of those parameters are small, there is a single sequence.
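The sign test described above can be sketched as follows. This is an illustration only, not the Break
function of the program: the function names and the threshold min_run are assumptions, and slopes are
given in thousandths to keep the example free of rounding effects.

```python
def sign_runs(slopes):
    """Run lengths of the sign of consecutive slope differences."""
    runs = []
    for prev, cur in zip(slopes, slopes[1:]):
        diff = cur - prev
        sign = (diff > 0) - (diff < 0)
        if sign == 0:
            continue                      # equal slopes carry no direction
        if runs and runs[-1][0] == sign:
            runs[-1][1] += 1
        else:
            runs.append([sign, 1])
    return [tuple(r) for r in runs]


def looks_joined(slopes, min_run=5):
    """True if two adjacent runs (always of opposite sign) are both long."""
    runs = sign_runs(slopes)
    return any(a[1] >= min_run and b[1] >= min_run
               for a, b in zip(runs, runs[1:]))


# Slopes that drift up and then down again, as in a join of two lines ...
joined = [2000 + 10 * i for i in range(8)] + [2070 - 10 * i for i in range(1, 8)]
# ... versus slopes that merely jitter around one value.
noisy = [2000, 2003, 1999, 2002, 2000, 2001]

print(sign_runs(joined), looks_joined(joined))
print(sign_runs(noisy), looks_joined(noisy))
```

A single noisy sequence produces many short runs, while a join produces few long ones; the absolute
size of the differences, which the text also mentions, is left out of this sketch.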
New tag firmware was released during CCC2007. Transmission no longer occurred every few seconds, but
about 10 times a second. This, together with a USB reader, made it possible to analyse whether
discarding the sub-second part of timestamps introduces a large error in the slope of lines. I took a
few minutes of readings and calculated two slopes, one taking all data into consideration and another
using the floor function to discard milliseconds. The resulting slopes differed only in the fourth
decimal place, so having only whole seconds for each transmission does not introduce an error that
prevents working with the data.
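The comparison can be repeated on synthetic data. The sketch below does not use the original readings;
the packet interval of 0.1037 s, the number of packets and the least-squares fit are assumptions made
only for illustration.

```python
import math


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = float(len(xs))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic readings: 3000 packets, one every 0.1037 s (a made-up interval).
counters = list(range(3000))
times = [0.1037 * c for c in counters]

slope_full = fit_slope(counters, times)
slope_floor = fit_slope(counters, [math.floor(t) for t in times])
print(slope_full, slope_floor, abs(slope_full - slope_floor))
```

Flooring shifts every timestamp by less than one second, so over a few minutes of data the fitted
slope changes only slightly.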
With too wide a range, sequences are joined; with too narrow a range, some points are left out; and in
neither case is there a guarantee that the appropriate points are included in a sequence. Additional
data therefore had to be included in the search for good sequences. The first additional variable that
could indicate whether to include a tuple in a sequence was signal strength. Each tag cycles the
strength of the sent signal, either through the sequence 0x00, 0xff, 0x55, 0xff, 0xaa, 0xff, 0xff, 0xff
or through 0x00, 0x55, 0xaa, 0xff, depending on the firmware version used.
The first problem would be that in the old firmware 5 out of 8 values were 0xff, so it would be
difficult to determine where in the sequence of signal strengths a particular point is. However,
analysis of the source code and the Sputnik data revealed that signal strength is not distinctive
between tags. Each tag starts at the same point of the strength sequence, so there is no variability
between sequences. If more than one point has the same counter value, those points also have the same
signal strength. It therefore cannot be used to distinguish different sequences.
As mentioned earlier, because of rounding errors the coefficients at the beginning of a sequence do not
have the same values as the coefficients for later points. The allowed range of slopes has to be wider
at the beginning and narrower near the end. This can be accomplished with a sigmoid function. The
function $0.01 + \frac{0.09}{1 + e^{(x-500)/100}}$ was used in the program. At distance 0 it gives a
border of about 0.1; its value decreases to about 0.01 for an argument of 1000. Because of very large
exponential values, an FPU exception was generated for arguments greater than about 70000.
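A minimal sketch of this border function with a guard against the overflow is given below. It is not
the code used in the program; the name slope_tolerance and the cut-off of 700 are assumptions.

```python
import math


def slope_tolerance(x):
    """0.01 + 0.09 / (1 + e^((x - 500) / 100)), safe for large x."""
    z = (x - 500.0) / 100.0
    if z > 700.0:
        # The exponential would overflow a double; the sigmoid term is
        # practically zero here, so only the constant part remains.
        return 0.01
    return 0.01 + 0.09 / (1.0 + math.exp(z))


print(slope_tolerance(0))
print(slope_tolerance(1000))
print(slope_tolerance(100000))

try:
    math.exp(710.0)
except OverflowError:
    print("math.exp overflows for arguments slightly above 709")
```

An exponent of 709 corresponds to x of roughly 71400, which agrees with the observation that the
exception appeared for arguments greater than about 70000.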
Because signal strength could not be used, the stations that received the signal from a tag were used
instead. The main assumption was that the set of stations that saw a tag does not change from one
point to the next if the points are close in time. To keep the algorithm simple, only the list of
stations was considered, not their distribution in space. Similarity was defined as the number of
stations present in both sets, divided by the size of the union of the sets.
If the signal strengths of the two points differ, the similarity function was changed slightly and
returned the number of stations seen with the weaker signal divided by the number of stations seen with
the stronger signal. But because most points in the data set had the strongest signal value, there
were not many situations with different signal strengths between points.
To avoid the errors shown in Figures 22, 23, and 24, the algorithm was changed to retrieve all potential
points that could be added to the generated sequence and to choose the best one itself. This approach
is a return to the idea of generating alternative sub-sequences used in the local algorithm.
Two points are in conflict when the condition ¬(T1 > T0 ∧ S1 > S0) holds. The program creates all
possible sub-sequences from such points and then chooses the best one. To choose the best, it locally
compares the lengths and slopes of the sub-sequences and the reading stations seen by each
sub-sequence, and chooses the one most similar to the main sequence.
The last version of the algorithm differs from the previous ones; the changes can be summarised as
“take more points and choose the best ones”. Instead of a constant range, a sigmoid function was used
to include more points at the beginning of a sequence. All points are read from the database, and the
program builds alternative sequences from them. Instead of a custom aggregate function that chooses
only one point, a standard function aggregating all seen stations into an array is used. This array is
then used to choose the best points to include in the sequence. The last change is breaking a line if
it is found to have a high probability of being two different lines.
The function Similarity returns a number in the range [0.0, 1.0]. This is the degree of similarity of
two sets of readers that were able to receive the signal from a tag. The function uses the sets
introduced in Python 2.4.
Similarity of seen stations
    station1, strength1 = b
    size0, size1 = len(station0), len(station1)
    if strength0[0] > strength1[0]:
        # Different strengths: share of stations of the second point also seen by the first
        same = 0.0
        for i in station1:
            if i in station0: same += 1
        result = same/len(station1)
    elif strength0[0] < strength1[0]:
        same = 0.0
        for i in station0:
            if i in station1: same += 1
        result = same/len(station0)
    else:
        # Same strength: size of the intersection divided by size of the union
        result = (float(len(set(station0)&set(station1)))/
                  float(len(set(station0)|set(station1))))
    return result
The function Fetch reads from the database all points that can be used to create a sequence. It takes
all packets that were received less than two minutes after the first point of the sequence, and then
returns those whose slope lies in the range determined by the sigmoid function.
Getting all points that can create line
The function Lines takes the list of all points read from the database and creates all possible
sequences from them. It is similar to the function used in the local algorithm.
Calculating all possible sequences from points
def Lines(data):
    result = []
    candidate = []
    for i in data:
        # Count the alternatives that point i could extend
        num = 0
        for j in candidate:
            if i[0] > j[-1][0] and i[1] > j[-1][1]:
                num += 1
        if len(candidate) == num:
            # Point follows every alternative: close the current group
            if len(candidate) == 1: result.extend(candidate[0])
            elif len(candidate) > 1: result.append(candidate)
            candidate = [[i]]
        else:
            for j in candidate:
                if i[0] > j[-1][0] and i[1] > j[-1][1]:
                    j.append(i)
            if 0 == num: candidate.append([i])
    # Add last alternative
    if len(candidate) == 1: result.extend(candidate[0])
    elif len(candidate) > 1: result.append(candidate)
    return result
The function Line takes all sub-sequences and chooses the best line from all given alternatives. For
each alternative up to five factors are calculated and taken into consideration: length, similarity of
slopes at the beginning and at the end, and similarity of seen stations at the beginning and at the
end. Only the best sub-sequence gets a point for each factor, and then the alternative with the most
points is chosen. If there is more than one best alternative, the first one is chosen.
A very important part of this function is the condition j[0][0] > result[-1][0] and
j[0][1] > result[-1][1], which allows only sub-sequences whose time and counter values are greater than
those already in the sequence to be considered as alternatives. This protects against building an
improper sequence when one set of alternatives is chosen right after another.
Choosing the best line from all alternatives
def Line(lines):
    result = []
    for i in xrange(len(lines)):
        if type(lines[i][0]) != type([]):
            result.append(lines[i])
        else:
            alternatives = []
            if len(result) > 0:
                for j in lines[i]:
                    if j[0][0] > result[-1][0] and j[0][1] > result[-1][1]:
                        alternatives.append(j)
            else:
                alternatives = lines[i]
            scores = [0] * len(alternatives)
            sizes = map(lambda x: len(x), alternatives)
            best = max(sizes)
            for j in xrange(len(alternatives)):
                if sizes[j] == best: scores[j] += 1
            stationsa = map(lambda x: Similarity((result[-1][4], result[-1][5]), (x[0][4], x[0][5])), alternatives)
            # Find best alternative for stations in the beginning
            if i+1 < len(lines) and type(lines[i+1][0]) != type([]):
                stationsz = map(lambda x: Similarity((x[-1][4], x[-1][5]), (lines[i+1][4], lines[i+1][5])), alternatives)
                # Find best alternative for stations in the end
            slopesa = map(lambda x: abs(alternatives[x][0][3]-result[-1][3]), xrange(len(alternatives)))
            # Find best alternative for slopes in the beginning
            if i+1 < len(lines) and type(lines[i+1][0]) != type([]):
                slopesz = map(lambda x: abs(alternatives[x][0][3]-lines[i+1][3]), xrange(len(alternatives)))
The function Break takes four consecutive points a, b, c, and d and returns a number in the range
[0.0, 1.0]: the probability that the line should be broken between points b and c because they belong
to different lines. It takes six factors into consideration: the difference in slopes between lines
a-b and b-c, and between b-c and c-d, the difference in time between consecutive points, the similarity
of seen stations between points b and c, and the absolute changes of slope between the local and global
value.
Function returning probability of break
The main function FindIDs calls all the previous functions and generates sequences. It decides to break
a line if the probability returned by Break is more than 0.5; in that case one iteration of the loop
creates more than one sequence.
Function creating all lines
        t0 = t[0]
        slope, count = Histogram(c, t0, s0, sa, sz)
        if slope > 0.0 and count >= 8:
            data = Fetch(c, t0, s0, slope, sa, sz)
            lines = Lines(data)
            line = Line(lines)
            for i in xrange(len(line)):
                skip = False
                if len(line[i][4]) != len(line[i][5]):
                    print "Error in size of ", line[i]
                    skip = True
                s = line[i][5][0]
                for j in line[i][5]:
                    if j != s:
                        print "Error in strength of ", line[i]
                        skip = True
                if skip:
                    break
                UPDATE sputnik.sputnik SET id = %s WHERE id IS NULL AND
                time = %s::TIMESTAMP WITH TIME ZONE AND sequence = %s::BIGINT
                if i > 0 and i < len(line)-2:
                    b = Break(line[i-1], line[i], line[i+1], line[i+2], slope)
                    if b > 0.5:
                        id += 1
                        print "Break here, new id ", id, b
            id += 1
    return id
Figure 31: Histogram of sizes of generated sequences for the third set
Figure 32: Histogram of sizes of generated sequences for the third set
belonging to any sequence. The problem with joining, however, is choosing which sequences to join with
one another. Which of the sequences shown in Figures 26, 27, 28, and 29 should be joined to the one
shown in Figure 30? This could be treated as a different case of the Break function: if none of the
causes for a break occurs, a join is possible. Another possible solution is manual joining. The program
could display a few candidates and let the user choose which ones look best together. If manual joining
is successful, the same approach could be used to change the generating algorithm and allow manual
choice of alternative sub-sequences.
The knowledge gathered while analysing the data and generating sequences leaves some doubts. I started
with the assumption that each tag sends a packet every 1.5 s. This led to setting the coefficient range
from 1 s to 2 s. Because this did not give good results in the local algorithms, and after observing
scatter plots, the global algorithms used a range from 0.0 to 10.0 and later, based on analysis of the
Sputnik firmware source code, from 1.0 to 5.0. The firmware source code contains two calls of a sleep
function. One sleeps for 2 s, and the other for a random period from 0 s to 2 s. This gives a range of
line slopes from 2 s to 4 s. But because the parameter of the second sleep is a random value, there
should be no straight line! However, scatter plots reveal many of them. So either the Sputnik data
contains so many points that one can draw any line, or the function rand() returns numbers that are not
very random. Based on analysis of packets generated by a single tag, the second possibility is true.
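The timing described above can be modelled in a few lines of Python. This is a hypothetical model, not
the firmware code: it assumes one packet per loop iteration, a fixed 2 s sleep plus a uniformly
distributed 0 s to 2 s sleep, and uses Python's own random generator in place of the tag's rand().

```python
import random


def simulate_tag(packets, seed=0):
    """Return (time, counter) points for one simulated tag."""
    rng = random.Random(seed)
    t = 0.0
    points = []
    for counter in range(packets):
        points.append((t, counter))
        t += 2.0 + rng.uniform(0.0, 2.0)   # fixed sleep plus random sleep
    return points


points = simulate_tag(1000)
t_last, c_last = points[-1]
overall_slope = t_last / c_last
intervals = [b[0] - a[0] for a, b in zip(points, points[1:])]
print(overall_slope, min(intervals), max(intervals))
```

Every interval, and therefore every slope measured from the first point, lies between 2 s and 4 s
per counter step, which is the range derived from the two sleep calls.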
Fragment of firmware of tag
No physical (or geometrical) model was taken into consideration while generating sequences. Neither the
distance between stations nor the speed of movement was analysed. Doing so could give better sequences
by limiting candidate points to those within reach of the previous point. On the other hand, this
approach would require calculating the position of each tag at every moment.
5.3 Analysis
The following paragraphs describe potential approaches. They depend on the validity of the generated
sequences. I have not yet performed any analysis of the data using the generated sequences, as
recovering them was my primary concern.
The XML data set proves that it is possible to calculate the position of a tag. Tags send packets with
different signal strengths to allow estimation of the distance from a reader. This estimation is based
on negative knowledge: if a reader is unable to read a signal of low strength, the tag is far away from
it. So from a few packets it is possible to calculate the minimal and maximal distance between tag and
reader. Signal power was set so that each successive power level doubles the radius of the range. This
gives two spheres, one with a small and one with a large radius; the person is between them. When data
from several readers is known, it is possible to calculate the common region of space where all those
spherical shells intersect, and this is the position of the tag. But this requires knowing the exact
positions of the readers.
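The distance estimate can be sketched as follows. Everything concrete here is an assumption made for
illustration: power levels are numbered from 0, the weakest level reaches base_radius, reader positions
are known, and the geometry is reduced to two dimensions.

```python
import math


def distance_bounds(weakest_level_heard, base_radius=1.0):
    """Return (inner, outer) radius of the shell in which the tag must lie.

    Each power level doubles the range. If the weakest level a reader still
    received is k, the tag is within the range of level k but outside the
    range of level k - 1.
    """
    outer = base_radius * 2 ** weakest_level_heard
    if weakest_level_heard == 0:
        return 0.0, outer
    return outer / 2.0, outer


def consistent(point, observations, base_radius=1.0):
    """True if point lies in the shell of every (reader position, level) pair."""
    for (rx, ry), level in observations:
        inner, outer = distance_bounds(level, base_radius)
        d = math.hypot(point[0] - rx, point[1] - ry)
        if d > outer or (level > 0 and d <= inner):
            return False
    return True


# Two hypothetical readers and the weakest level each of them received.
observations = [((0.0, 0.0), 2), ((5.0, 0.0), 1)]
print(distance_bounds(2))
print(consistent((3.0, 0.0), observations))
print(consistent((0.5, 0.0), observations))
```

Testing candidate positions against all shells narrows the position down to their common region; a real
implementation would also have to deal with missed packets and with attenuation by obstacles.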
The human body decreases signal strength. This decreases the precision of estimating the position of a
tag. But perhaps it could be used to calculate the direction a person is facing, assuming that the tag
is worn on the front. The range would then not be a sphere but two hemispheres, larger in front and
smaller behind. This would require more calculations (two for each reader), but as there is no
situation in which all readers see one tag, it would not be impossible. The direction could be
confirmed when the person moves in that direction, again assuming that the person walks forwards, not
backwards.
A simple analysis is calculating the time of entering and leaving the BCC. Most people leave the Center
for the night, but some stay. Also, when one sequence disappears and another one appears in the same
place, it means that someone was playing with the battery and reset the tag.
The most interesting analysis is looking for connections and similarities between attendees. This can
be done by looking for people who attended similar talks. Those people may not even know each other
but have common interests.
Another research area is looking for friends. Friends can be defined as people who stay together;
they tend to be together not only during talks, but also, and especially, during breaks. If two people
are close during most breaks, they are close friends. If they are close at some times and not at
others, they may be colleagues. Or they may just be standing in the same queue for pizza. Here,
however, what matters most is relative position (the distance between people), not the exact position
of tags.
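One simple way to measure "being together" is sketched below. No such analysis was performed on the
data; the data layout (a mapping from time slot to the set of readers that saw a tag) and the function
name together_ratio are assumptions for illustration.

```python
def together_ratio(seen_a, seen_b):
    """Fraction of common time slots in which two tags shared a reader.

    seen_a, seen_b: dict mapping a time slot to the set of readers
    that received the tag in that slot.
    """
    slots = set(seen_a) & set(seen_b)
    if not slots:
        return 0.0
    together = sum(1 for s in slots if seen_a[s] & seen_b[s])
    return together / float(len(slots))


# Hypothetical readings for four breaks.
alice = {1: {"r1"}, 2: {"r2"}, 3: {"r3"}, 4: {"r1"}}
bob = {1: {"r1", "r2"}, 2: {"r2"}, 3: {"r5"}, 4: {"r1"}}
print(together_ratio(alice, bob))
```

Sharing a reader is only a rough stand-in for distance, but it needs neither reader positions nor exact
tag positions.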
This data set leaves many conclusions to be drawn.
Hacking
lecture
Day 1 21:45
Saal 2
de
AnonAccess
An anonymous access control system
AnonAccess is an electronic system that enables anonymous access, and not only to
hackerspaces.
Using cryptographic methods, the microcontroller-based system can control secure and anonymous
access in an astonishingly simple way. The talk shows how the various primitives interact, taking
the limitations of embedded systems into account. Attack scenarios and the requirements placed on
such systems are a further subject of examination. The complete system is presented, from the ICC
memory card through the secured communication to the encrypted database.
AnonAccess
das Labor
http://www.das-labor.org
Abstract
This paper gives an overview of the AnonAccess system, which provides
access to a given functionality (e.g. opening a door) to users who may be
known by name, by pseudonym or by a shared pseudonym. We attempt to extend
and implement the shared-pseudonym access feature in such a way that it can
be claimed to be anonymous.
3 Components
The AnonAccess system is divided into a Terminal-Unit and a Master-Unit. Additionally,
there is a chip card for each user, which stores the user’s authentication
data.
3.1 Chip-Card
We use simple memory cards with an I²C bus[4] and form factor ID-1 as specified
in [5][6]. They are quite cheap (less than 1 € per card) and not secure. Their
contents can easily be read or modified, so everyone can read and check what
we write on his/her card.
The card contains a so-called AuthBlock embedded in an ASN.1-BER[7]
octet-string object. The AuthBlock has the following structure:
3.2 Terminal-Unit
The Terminal-Unit handles user input, displays information and reads and
writes the user’s card. It is equipped with a keypad, a display, a card reader and a
hardware random number generator. Its power is supplied by the Master-Unit,
so it should not be reset even in the case of a power failure.
3.3 Master-Unit
The Master-Unit keeps the databases, performs the authentication and executes the
secured action (e.g. opening a door).
3.6 Microcontroller
We use microcontrollers from the ATmega family from Atmel[13] for both units.
They are relatively cheap and support protecting the internal memories (flash
and EEPROM) from being read through their lock-bit feature. There is also a
toolchain available for these 8-bit microcontrollers, including GCC’s[16] C compiler
and a libc implementation[17], which eases writing the software.
The Master-Unit uses an ATmega644[14] in a DIL package with 64 KiB of
program flash, 4 KiB of internal SRAM and 2 KiB of internal EEPROM (100,000
rewrite cycles guaranteed).
The Terminal-Unit uses an ATmega32[15] in a DIL package with 32 KiB of
program flash, 2 KiB of internal SRAM and 1 KiB of internal EEPROM (100,000
rewrite cycles guaranteed).
5 Usage
This section describes the AnonAccess system from the user’s point of view.
5.1.2 mainclose
Execute a special action (e.g. closing/locking a door).
5.1.3 adduser
Add a user to the system. A user nickname must be specified. A user is added
by generating a new valid AuthBlock which is written to an empty card, and by
writing corresponding information to the TicketDB.
5.1.4 remuser
Remove a user from the system. A user nickname must be specified. If the
nickname is stored in the TicketDB, the entry in the TicketDB is deleted
immediately, which includes setting the exists-flag to 0. If the nickname is not stored
in the TicketDB, a new entry in the FLMDB is generated, which leads to removal of the
account when an AuthBlock is processed whose user pseudonym matches the
generated user pseudonym.
5.1.5 lockuser
Same as removing a user, but instead of deleting the entry only the lock bit is
set, which causes the system to no longer accept the card as a valid user card.
5.1.6 unlockuser
Same as removing a user, but instead of deleting the entry, the lock bit is
cleared if it is set.
5.1.7 addadmin
Same as removing a user, but instead of deleting the entry, the admin bit will
be set, granting admin privileges to the user.
5.1.8 remadmin
Same as removing a user, but instead of deleting the entry, the admin bit is
cleared if it is set, so the user no longer has admin privileges.
5.1.9 keymigrate
Initiate a key-migration, which will write the internal secret keys to the external
serial EEPROM. This might not be implemented for security reasons.
5.2 Privileges
The system differentiates between “normal” (non-admin) users and admin users.
To execute a given task in a session, special authorisation requirements must be
met. These requirements are given as the number of users and admins who
have to participate in the session. Admin privileges may be restricted to
users who are known by nickname. The given example of minimum
permission levels assumes that admin privileges are restricted to users that are
known by nickname.
6 Ideal run
1. User inserts card in Terminal-Unit
7.2.1 Terminal-Unit
The Terminal-Unit is considered trusted; in particular, the connection between the
microcontroller and the card must be protected.
7.2.2 Master-Unit
The Master-Unit is considered trusted; in particular, the serial bus between the
microcontroller and the external serial EEPROM must be protected. Although
the external EEPROM’s content is encrypted, an attacker might gather useful
information from the addresses that are accessed.
A The PRNG
The PRNG utilises SHA-256 as hash function. The entropy pool is 64 bytes
(512 bits) large, which is the block size of SHA-256. We specify two algorithms
which implement the functionality of the PRNG, one to add entropy to the
entropy pool and one to get a block (32 bytes) of random data.
B The Shabea-Cipher
Shabea (SHA based encryption algorithm) is a SHA-256 based Feistel cipher.
It was designed to securely encrypt data where a SHA-256 implementation is
available. The goal was a symmetric cipher that is small (in program space and
memory requirements) and nevertheless secure, for the case that a SHA-256
implementation is already available.
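The general idea of a Feistel network with a hash as round function can be sketched as follows. This
is not the Shabea specification, which is not reproduced here: the number of rounds, the way key and
round index enter the hash, and the block size are assumptions, and the sketch has not been reviewed
for security.

```python
import hashlib


def _round_function(key, index, half):
    """Hash key, round index and one half of the block; truncate to the half size."""
    return hashlib.sha256(key + bytes([index]) + half).digest()[:len(half)]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _split(block):
    if len(block) % 2 or len(block) > 64:
        raise ValueError("block must have an even length of at most 64 bytes")
    half = len(block) // 2
    return block[:half], block[half:]


def feistel_encrypt(key, block, rounds=4):
    left, right = _split(block)
    for i in range(rounds):
        # (L, R) -> (R, L xor F(R))
        left, right = right, _xor(left, _round_function(key, i, right))
    return left + right


def feistel_decrypt(key, block, rounds=4):
    left, right = _split(block)
    for i in reversed(range(rounds)):
        # Undo one round: (L', R') -> (R' xor F(L'), L')
        left, right = _xor(right, _round_function(key, i, left)), left
    return left + right


key = b"k" * 16
plaintext = bytes(range(32))
ciphertext = feistel_encrypt(key, plaintext)
print(ciphertext.hex())
print(feistel_decrypt(key, ciphertext) == plaintext)
```

The attraction for a small microcontroller is that decryption reuses the same round function, so apart
from SHA-256 itself only a few lines of glue code are needed.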
References
[1] When is a kilobyte a kibibyte? And an MB an MiB? (http://www.iec.ch/zone/si/si_bytes.htm)
[2] FIPS 180-2: Secure Hash Standard (SHS)
(http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf)
[3] RFC 2104: HMAC: Keyed-Hashing for Message Authentication
[4] The I²C-Bus Specification, Version 2.1, January 2000, original specification from NXP
Semiconductors (http://www.nxp.com/acrobat_download/literature/9398/39340011.pdf)
[5] ISO/IEC 7816-1:1998 Identification cards – Integrated circuit(s) cards
with contacts – Part 1: Physical characteristics
[6] ISO/IEC 7816-2:1999 Identification cards – Integrated circuit cards – Part
2: Cards with contacts – Dimensions and location of the contacts
[7] ITU-T Rec. X.690: Information technology – Abstract Syntax Notation One (ASN.1):
Specification of basic notation
(http://www.itu.int/ITU-T/studygroups/com17/languages/X.680-0207.pdf)
[8] 24AA512/24LC512/24FC512 512K I²C CMOS Serial EEPROM,
datasheet by Microchip (http://ww1.microchip.com/downloads/en/DeviceDoc/21754H.pdf)
[9] 24AA1025/24LC1025/24FC1025 1024K I²C CMOS Serial EEPROM,
datasheet by Microchip (http://ww1.microchip.com/downloads/en/DeviceDoc/21941E.pdf)
[10] The Microchip Corporation web presence (http://www.microchip.com)
[11] QPort-tiny specification, Daniel Otte (http://nerilex.3dots.de/qport-tiny.pdf)
[12] Tea extensions, Roger M. Needham and David J. Wheeler (Notes October
1996, Revised March 1997, Corrected October 1997)
(http://www.cix.co.uk/~klockstone/xtea.pdf)
[13] The Atmel Corporation web presence (http://www.atmel.com)
[14] ATmega644 Preliminary (revision M, updated 08/07)
(http://www.atmel.com/dyn/resources/prod_documents/doc2593.pdf)
[15] ATmega32(L) (revision K, updated 08/07)
(http://www.atmel.com/dyn/resources/prod_documents/doc2503.pdf)
[16] GCC, the GNU Compiler Collection (http://gcc.gnu.org)
[17] AVR Libc Home Page (http://www.nongnu.org/avr-libc/)
Science
lecture
2007-12-30 14:00
Saal 3
en
Immanuel Scholz
$k_{ab}$, $k_{ac}$, $k_{bc}$
$k_{ab} + k_{ac} + m_{alice}$
$-k_{ab} + k_{bc} + m_{bob}$
$-k_{ac} - k_{bc} + m_{charlie}$
$(k_{ab} + k_{ac} + m_{alice}) + (-k_{ab} + k_{bc} + m_{bob}) + (-k_{ac} - k_{bc} + m_{charlie}) = m_{alice} + m_{bob} + m_{charlie}$
$H + H = T$, $H + T = H$, $T + T = T$
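The cancellation of the pairwise keys in the sum above can be checked numerically. The sketch below is
an illustration, not the protocol of the talk: the modulus, the use of Python's secrets module for the
keys and the function name dc_net_round are assumptions.

```python
import secrets


def dc_net_round(messages, modulus=2 ** 32):
    """Return what each participant broadcasts in one round.

    For every pair (a, b) with a before b in sorted order, a adds the
    shared key and b subtracts it, as in the terms
    k_ab + k_ac + m_alice, -k_ab + k_bc + m_bob, -k_ac - k_bc + m_charlie.
    """
    names = sorted(messages)
    keys = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            keys[(a, b)] = secrets.randbelow(modulus)
    broadcasts = {}
    for name in names:
        total = messages[name]
        for (a, b), k in keys.items():
            if name == a:
                total += k
            elif name == b:
                total -= k
        broadcasts[name] = total % modulus
    return broadcasts


messages = {"alice": 42, "bob": 0, "charlie": 0}
broadcasts = dc_net_round(messages)
print(broadcasts)
print(sum(broadcasts.values()) % 2 ** 32)
```

Each individual broadcast is masked by random keys, yet the sum of all broadcasts equals the sum of the
messages because every key is added once and subtracted once.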
$-k_{ac} - k_{bc} + m_{charlie} + k_{bc} = -k_{ac} + m_{charlie}$
$p$, $g$, $g^{x_{alice}} \bmod p$
$r_1, \dots, r_{n-1}$, $k - \sum_{i=1}^{n-1} r_i$
$l_{ab}$, $k_{ab}$, $\mathrm{sign}_{bob}(k_{ab})$, $\mathrm{sign}_{alice}(k_{ab})$
Hacking
lecture
2007-12-29 11:30
Saal 3
de
Tonnerre Lombard
2.2 Synchronisation problems
Whenever code that accesses the same things is executed in parallel, problems
can arise. This starts with locking open files, extends to parallel access to data
shared between threads, and goes all the way to signal handling. Whenever program
flow does not follow a single thread of control, some form of synchronisation is
required.
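A minimal sketch of such synchronisation, using a lock around a shared counter, is given below. The
example is not taken from the talk; the function name and the numbers are chosen for illustration.

```python
import threading


def count_with_lock(workers=4, increments=10000):
    """Increment a shared counter from several threads under a lock."""
    counter = [0]
    lock = threading.Lock()

    def work():
        for _ in range(increments):
            with lock:                # only one thread may update at a time
                counter[0] += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]


print(count_with_lock())
```

Without the lock the read-modify-write of the counter is not guaranteed to be atomic, so updates from
different threads may be lost.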
2.2.3 Signal handling attacks
Signals are another, often underestimated form of asynchronous program execution,
and under certain circumstances they too can be used to execute
code.
2.3 Format string attacks
In most cases format string attacks initially only allow memory to be read,
but that memory may already contain interesting
information.
2.4 Injection attacks
Whenever several languages are embedded in one another, it is advisable to make
sure that elements of the inner language are not mixed with elements of the
outer language. This problem arises also, and above all, with user-controlled
input to applications that is represented in the application's output or in
generated commands.
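As an illustration of mixing an inner language (user data) with an outer language (SQL), the following minimal sketch uses Python's standard `sqlite3` module. The table layout and the function names `make_db`, `lookup_unsafe` and `lookup_safe` are hypothetical and are not taken from the talk.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two users."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    db.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return db


def lookup_unsafe(db, name):
    # Vulnerable: the user input becomes part of the SQL text itself,
    # so a quote character in `name` can change the query structure.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in db.execute(query)]


def lookup_safe(db, name):
    # The placeholder keeps the input as pure data, never as SQL syntax.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in db.execute(query, (name,))]
```

With the input `x' OR '1'='1`, the concatenated query matches every row, while the parameterized query looks for a user literally named that way and finds none.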
2.4.1 Format injection
Format injections are the oldest kind of injection vulnerability. Here, the
delimiter characters of a format in an inserted, not ...
2.5.3 Session theft
Running sessions are often an attack target, since they are a simple way to get
at another person's account, be it to spy on data, to impersonate the person,
or to abuse their privileges. Code examples show the ways in which another
person's session can be taken over.
Via SQL injection
4 Concluding remarks
Finally, some hints on the architecture of secure systems are given. These
range from a renewed reminder to check for buffer overflows to a note on how
SSL client certificates can eliminate the annoying cookie problems once and for
all.
Society
lecture
2007-12-29 12:45
Saal 1
Tomislav Medak
en
Toni Prug
Marcell Mars
Hacking
lecture
2007-12-28 21:45
Saal 2
en
lucy
Lucy <whoislucy(at)gmail.com>
Inside the Mac OS X Kernel
Debunking Mac OS Myths
24th Chaos Communication Congress 24C3, Berlin 2007
BSD kernel with memory management and scheduling stripped out, and process
management built on top of Mach tasks.
The problem with the Mach design was that the kernel was slower than a
traditional monolithic kernel because of the extra kernel/user context switches
when a server communicated with the kernel or servers communicated with each
other. On a monolithic kernel, these were just simple function calls. The
simplest solution for this problem is “co-location”: The personality servers
run in kernel mode, and communication is fast again. While it somewhat defeats
the original idea of a microkernel, it still has the advantage of
well-partitioned kernel components and a more modern core kernel: The Mach
memory management code was later integrated into BSD.

NEXTSTEP
NEXTSTEP, which was released in a 1.0 version in 1989, chose to go with this
design. NEXT had removed the core kernel parts from the 4.3BSD kernel and
layered it on top of Mach, in kernel mode. This way, NEXT was many years ahead
of the competition with NEXTSTEP being the first desktop/GUI operating system
that supported preemptive multitasking, memory protection and UNIX
compatibility. At first NEXTSTEP only ran on their own Motorola 68K-based
machines, but was later ported to SPARC, PA-RISC and i386, when NEXT started
licensing it under the name “OpenStep” to other hardware manufacturers, so it
was highly portable. When Apple acquired NEXT in 1997, they added PowerPC
support and removed support for all architectures other than i386; the latter
would serve as the fallback solution when Apple switched from PowerPC to i386
in 2005/2006.

Rhapsody and OS X
With Apple’s acquisition of OpenStep, many more changes were made to the
operating system which now had the interim name “Rhapsody”: They replaced the
“DriverKit” driver model with the new “I/O-Kit” system, updated Mach 2.5 with
the Mach 3.0 codebase, updated the BSD part with 4.4BSD and FreeBSD code and
added support for the HFS filesystem and Apple networking protocols to the
kernel. In userland, Mac OS X is pretty much NEXTSTEP/OpenStep, with the native
“NS” API renamed to Cocoa, the Mac OS 9 API “Toolbox” ported as a compatibility
API (now named “Carbon”), “carbonized” versions of the OS 9 Finder and
QuickTime technologies, plus a VMware-like Virtual Machine called BlueBox
(“Classic”) that runs OS 9 and its applications unmodified.

Architecture
The Mac OS X kernel, named “XNU” (“X is not UNIX”) consists of three main
components: Mach, BSD and I/O-Kit.

Mach
Being the only operating system that still uses Mach code (not counting
GNU/HURD), Mac OS X has evolved from the original code base quite a bit, but
the architecture is basically unchanged. Mach (“osfmk” in the kernel source
tree, which stands for “OSF microkernel”) calls address spaces “tasks”, and one
task can contain zero or more threads. Being policy-free, there is little
information associated with a task, so, for example, there is no UNIX-style
current working directory or environment associated with it. While there are
few surprises in the memory management code compared to other modern operating
systems, the key distinctive feature of Mach is Mach Messaging. A task can have
any number of “ports”, which are interprocess communication (IPC) endpoints.
One task can subsequently send a message from its originating port to its peer
port, and Mach will take care of security, enqueueing, dequeueing, network
opacity (ports can be on different machines) and, if necessary, byte swapping.
For programming convenience, the Mach Interface Generator (“MIG”) can generate
stub code from interface definitions, so that two processes can talk to each
other using simple function calls, but internally, this will be translated into
Mach messages.

BSD
The BSD part of the kernel implements UNIX processes on top of Mach tasks, and
UNIX signals on top of Mach exceptions and Mach IPC. UNIX filesystem semantics
are implemented here just like TCP/IP networking. And while the VFS (virtual
filesystem) component allows plugging in BSD-style filesystems, the /dev
infrastructure plugs right into I/O-Kit. BSD exports all the semantics that an
application expects from a UNIX/BSD/POSIX compatible operating system, like
“open()” and “fork()”, through the syscall interface.
Since there are basically two kernels in XNU - Mach with its message passing
API and BSD with the POSIX API - there are two kinds of syscalls. While both
use a single int 0x80/sysenter/sc entry point, negative syscall numbers will be
routed to Mach, while positive ones go to BSD. Note that, just like on Windows
NT, applications may not use int 0x80/sysenter/sc directly, as this is a
private interface. Instead, applications must call through libSystem, which is
the equivalent of libc on OS X.

I/O-Kit
When NEXTSTEP was ported to different architectures and was renamed to
OpenStep, it got a new driver model, called “DriverKit”, which was based on the
Objective C programming language and therefore was object oriented, and allowed
an inheriting hierarchy of device drivers: For example, there could be a
generic IDE/ATA device driver that handled reads and writes of blocks on an IDE
bus, a hard disk driver and a CD-ROM driver that subclassed the generic IDE
driver, and another CD-ROM driver that subclassed the generic CD-ROM driver to
work around some quirks for one specific CD-ROM drive model. This architecture
helps a lot to combat duplicate code: In contrast to other operating systems
like Linux, a new device driver is not written by copying the closest match and
modifying it, but by subclassing an existing driver binary and overwriting some
methods with new code. “I/O-Kit” is a higher performance reimplementation of
DriverKit in a subset of C++ (no exceptions, multiple inheritance, templates,
runtime type information). I/O-Kit supports some classes of drivers in user
mode.

KEXTs
I/O-Kit drivers are dynamically linked at runtime, as so-called “KEXTs”
(“Kernel Extensions”). KEXT can not only link against the I/O-Kit component,
but also against other parts of the kernel. This way, filesystem and networking
KEXTs (NKEs) are possible. Every KEXT, which typically resides in
/System/Library/Extensions, is a bundle, i.e. a subdirectory which contains the
actual binary and an XML description of dependencies and the parts of the
kernel it links against.

Other interesting details
The following sections describe some other interesting details of or around the
Mac OS X kernel.

Booting
While PowerPC-based Macs use OpenFirmware, Intel-based machines use EFI
(“Extensible Firmware Interface”). Both kinds of firmware are a lot more
powerful than the 16 bit BIOS still shipping on PCs. While EFI can boot off USB
and supports GPT partitioning and FAT32 file systems, the rest of the feature
sets of OpenFirmware and EFI are pretty similar: Both can boot off FireWire,
and both support APM (“Apple Partition Map”) partitioning and the HFS file
system, as well as firmware-level drivers. BootX is the bootloader for
OpenFirmware, and boot.efi the bootloader for EFI. Both can decode HFS and can
therefore read the kernel from the root partition. If there is a “KEXT cache”,
i.e. a file with all prelinked KEXTs suited for this configuration, that is
newer than the newest file in /System/Library/Extensions and newer than the
running kernel, the boot loader will load this cache; otherwise, it will go
through all KEXTs and load the appropriate ones by comparing them to the
entries of the “device tree” which has been passed from the firmware to the
bootloader. Later, a KEXT cache will be written to disk to speed up the next
boot. This is somewhat similar but more flexible than the Linux “initrd”
approach.

Mach-O
Mac OS X does not use the ELF file format for binaries (executables, libraries,
KEXTs) like practically all other UNIX systems. Instead, it uses Mach-O, which
has roughly the same feature set, but one interesting addition: A single,
so-called “fat” or “universal” binary can contain code for more than one
architecture. So on OS X 10.5 Leopard, for example /usr/lib/libSystem.dylib
contains code for PowerPC, PowerPC 64, i386 (32 bit Intel) and x86_64 (64 bit
Intel). This way, a single Mac OS X 10.5 Leopard installation DVD can boot on
four different architectures, and there is no need for “lib/lib64” (64 bit
Linux) or
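
To make the “fat” binary idea from the Mach-O section concrete, here is a minimal sketch that lists the architectures contained in such a file. It assumes the big-endian header layout declared in Apple's `<mach-o/fat.h>` (a magic number and an architecture count, followed by one 20-byte record per architecture). The function name `list_architectures` and the small name table are illustrative, not part of any Apple tool.

```python
import struct

FAT_MAGIC = 0xCAFEBABE          # magic number of a fat/universal binary
CPU_ARCH_ABI64 = 0x01000000     # flag bit marking 64 bit CPU types
CPU_NAMES = {
    7: "i386",
    7 | CPU_ARCH_ABI64: "x86_64",
    18: "ppc",
    18 | CPU_ARCH_ABI64: "ppc64",
}


def list_architectures(data):
    """Return (name, offset, size) for each architecture slice in `data`."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat binary")
    slices = []
    for index in range(count):
        # Each record: cputype, cpusubtype, offset, size, align.
        cputype, _subtype, offset, size, _align = struct.unpack_from(
            ">iiIII", data, 8 + 20 * index
        )
        slices.append((CPU_NAMES.get(cputype, hex(cputype)), offset, size))
    return slices
```

On a real system one would pass in the first few hundred bytes of a library file. The loader uses the offset and size of the matching entry to map only the slice for the running CPU.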
otherwise mostly equivalent to the PowerPC and i386/x86_64 versions.

What makes XNU great
While XNU might not be as scalable or as tidy as other operating systems (but
catching up), it is a very modern UNIX with novel ideas and unique features:
• The kernel extension ABI is stable over several major releases of the OS.
• Fat/universal binaries allow for a single install CD or hard disk
  installation that runs on different CPU architectures, without the clutter
  of duplicating files or directories. Furthermore, 3rd party application
  vendors can ship a single binary that runs on multiple architectures.
• I/O-Kit allows code reuse for drivers without code duplication.
• The KEXT cache is a clean way to speed up boot times.
• The clear separation between Mach, BSD and I/O-Kit helps keeping the cost of
  code maintenance low.
• The powerful Mach Message API is useful for user mode applications.
• Since Mac OS X 10.5 Leopard, the i386 port of OS X is the only operating
  system with full POSIX-conformance that doesn't contain AT&T UNIX code.

Open Source & Hacking
With every minor operating system release (i.e. 10.5.0, 10.5.1...), Apple
usually releases the whole set of source code for all components of the system
that are under an open source license, which is basically everything but the
GUI. About half of these packages are patched versions of common open source
projects (like “bash” and “perl”), the rest is Apple code, and is released
under the “Apple Public Source License” APSL, which is a BSD-style license.
This makes it compatible with the standard BSD license, as well as with the
OpenSolaris CDDL. But there is no live source code repository for developers
visible outside Apple, so there is no real open source community that does any
development on the APSL components. But there are other uses for Open Source:
It helps KEXT developers debugging, it allows governmental or educational
institutions to build their own versions, with added security for example, and
it allows commercial companies or universities to add functionality to the
kernel, either to sell it, or for research (SEDarwin, L4/Darwin).
But the source code is not necessarily complete. The XNU source code lacks most
of the ARM bits, and Apple also states that other parts have been left out
because of trade secrets with Intel. But a kernel compiled from the open source
can still be used as a drop-in replacement for the shipping binary.

Revisiting the Buzzwords
• The OS X kernel is not Mach. The OS X kernel is called “XNU”, which consists
  of Mach, BSD and I/O-Kit.
• The OS X kernel is not a microkernel. Although Mach has been used as a
  microkernel in other projects, XNU is a very traditional monolithic kernel
  with BSD and (most) drivers in kernel mode.
• The OS X kernel is not based on FreeBSD. The BSD part is based on 4.4BSD with
  some code from FreeBSD, NetBSD and others. The OS X userland UNIX tools are
  mostly based on FreeBSD code, though.
• The OS X kernel is not written in C++. The I/O-Kit part is written in a
  subset of C++, but Mach and BSD are written in C.
• The OS X kernel is not 64 bit. It supports 64 bit user mode applications on a
  64 bit PowerPC or Intel CPU, but the kernel itself runs in 32 bit mode and is
  bound to the 4 GB address space limit.
• The OS X kernel is Open Source, but there is no live source code repository
  visible outside of Apple, and the released source does not necessarily
  contain all code, but can be compiled into a working system.
• The OS X kernel is UNIX, but only since OS X 10.5 Leopard, and only for 32
  bit i386, since this is the configuration that passed the POSIX conformance
  test and may therefore use the OpenGroup's “UNIX” trademark.

References
• Singh, Amit: Mac OS X Internals. A Systems Approach; Addison-Wesley, 2006.
• http://kernel.macosforge.org/
• http://www.opensource.apple.com/darwinsource/
Science
lecture
Tag 3 12:45
Saal 3
en
Jens Kaufmann
Introduction to MEMS
Skills for very small ninjas
MicroElectroMechanical Systems (MEMS) are, as part of micro system technology, systems
with electrical and mechanical subsystems at the micro scale. This talk is a basic introduction
to the technology, its potential for hardware hacks, and possible routes to homebrew
devices.
Compared to a microprocessor or a small sensor or actuator, which normally performs just one
function, a micro system combines data acquisition, processing, and forwarding in itself. If
this micro system also contains mechanical parts to interact with its environment, it is considered
to be a MEMS. With constantly increasing experience in MEMS manufacturing, the price per
system has dropped, and the use of these highly sophisticated devices is moving from strictly
automotive, R&D and military applications into consumer products. The Wiimote and the iPhone
are just two well-known products that improve the user experience through the intelligent use of
these smart systems. The delay between invention and market introduction of MEMS is mostly
caused by the substantial investment needed to produce this kind of device. Most technologies
commonly used until now have been transferred from microchip manufacturing. The so-called silicon
MEMS are always systems that consist of different components with three major functions: input,
processing and output. This is what differentiates a micro system from a micro structure, and it
is what allows interaction with the environment. These components can be manufactured
separately (modular integration) or all on one substrate (monolithic integration), as
shown above. [1]
Typical MEMS are made from single-crystal silicon discs. These discs are made by pulling a
rotating seed crystal out of a bath of molten silicon. The resulting rod is then sliced, lapped
and polished. This ensures a bulk material of constant quality.
The typical silicon processing for MEMS is based on the lithography used in microelectronics. A
photo mask is necessary for every step in the process that requires selective exposure. The mask
can be positive or negative, depending on the chosen resist. The process flow always looks like
this:
1. apply photoresist
2. expose photoresist
3. develop photoresist
8. go to 1
Another widely used technology is LiGA. LiGA is the German acronym for lithography,
electroplating (Galvanoformen), and molding (Abformen). In the beginning it was only possible
by using high-energy x-rays to expose a PMMA resist. This resist covered a conductive seed
layer, which made it possible to electroplate into the mould and so electroform large 2.5D
metallic structures. The electroplated structure is then removed from the wafer and itself
becomes a mould for micro injection moulding. This makes it possible to produce many parts
relatively cheaply. The biggest disadvantage is the need for a synchrotron to generate the
x-rays.
Today, UV LiGA, which uses coherent UV light and a negative resist such as SU-8, is commonly
used to achieve similar structures ("poor man's LiGA"). The drawback of this method is the
relatively low resolution caused by the long wavelength of UV light.
Figure: Surface micromachined gyroscope
The manufacturing of MEMS is still a large-scale batch process. Even a small cleanroom
with the necessary facilities to run one process chain for silicon costs between 5 and 10
million. Such a process is also intrinsically inflexible with respect to design changes, as
they are costly and difficult.
Errors are costly too, which makes it unavoidable to manufacture tremendous quantities
just to cover costs.
The industry is currently experiencing the same problems, with the market drifting
towards tailored solutions. "Responsive manufacturing" is the weapon to face this new
development. It means that production capabilities must be built that allow a product to
be produced cost-effectively in a "batch of one".
In MEMS this is even more difficult than in other industries because everything is based
on one material. The academic community is constantly trying to develop new processes
with new materials to enable manufacturing by smaller players without large financial
resources.
And this is where fabbing takes its place in future home-grown MEMS development. A
fabber is basically a 3D manufacturing device that allows the user to manufacture
physical free-form objects. Most ideas are based on rapid prototyping/manufacturing of
3D structures. Additive modelling generates 3D structures by successively adding
material at the right place. Most rapid prototyping technologies work with this approach,
like stereo lithography and fused deposition modelling. Electrodeposition and chemical
vapour deposition are also considered additive modelling. The superiority of this method
compared to subtractive methods is due to the fact that less waste is produced and the
design space is not predetermined.
Figure: STL generated spider models made from resin at the LTZ Hannover
Different concepts from rapid prototyping have proven capable of producing
microstructures. Stereo lithography (STL), for example, uses a liquid epoxy resin with a
photoactive linker as its material. This resin is locally cured by writing with a laser
beam onto the liquid surface. The cured layer sticks to the vertically movable stage. This
stage is then lowered further into the resin so that liquid resin covers the object and
the next layer can be cured by the laser. No support structures are necessary. The laser
centre in Hannover, Germany has demonstrated that it can produce micro parts with this
technology. [3]
References
[1] W. Menz and P. Bley, "Mikrosystemtechnik für Ingenieure", VCH, Weinheim, 1993,
ISBN 3-527-29003-6 (in German).
[3] A. Neumeister, S. Czerner, A. Ostendorf, "Metal and polymer microparts generated by laser
rapid prototyping", in: 4th International Congress on Laser Advanced Materials Processing,
16-19 May 2006, Kyoto, Paper No. 050873.
[4] Horst Exner, Peter Regenfuss, Lars Hartwig, Sascha Klötzer, Robby Ebert, "Selective Laser
Micro Sintering with a Novel Process".
[5] A. Bandyopadhyay, R.K. Panda, V.F. Janas, M. Agarwala, S.C. Danforth and A. Safari,
"Processing of Piezocomposites by Fused Deposition Technique", J. Am. Cer. Soc., 80 (6),
1366-72 (1997).
[6] Giovanni Vozzi, Christopher Flaim, Arti Ahluwalia and Sangeeta Bhatia, "Fabrication of PLGA
scaffolds using soft lithography and microsyringe deposition", Biomaterials, Volume 24,
Issue 14, June 2003, pages 2533-2540.
Hacking
lecture
2007-12-28 17:15
Saal 3
en
Peter Molnar
Roland Lezuo
http://cacaojvm.org/ cacaojvm.org
1 About CACAO
CACAO is a multiplatform Java Virtual Machine featuring a just-in-time
compiler. Although CACAO features an interpreter, by default it works in
JIT-only mode, so all code gets compiled prior to execution. The CACAO
project was started in 1997 as a research project at Vienna University of
Technology. Today the project is fully covered by the GPL v2 license.
2 CACAO Code Generators
CACAO provides code generators for many platforms: currently, code
generators for ALPHA (FreeBSD, Linux), ARM (Linux), i386 (Cygwin, Darwin,
FreeBSD, Linux), MIPS (Irix, Linux), POWERPC (Darwin, Linux, NetBSD),
SPARC64 (Linux), x86_64 (Linux) and s390 (Linux) are available. A code
generator has to implement a defined internal interface consisting of a set of
exported functions and symbols, and is linked statically into the virtual
machine.
3 Java bytecode
The Java compiler does not produce machine code that can be executed
directly on the host CPU, but an intermediate representation called bytecode,
which targets a virtual machine. There are around 200 bytecode instructions
defined in the Java Virtual Machine Specification
(http://java.sun.com/docs/books/jvms/second_edition/html/VMSpecTOC.doc.html).
The most notable difference between Java bytecode and usual machine code is
that bytecode instructions don't use registers as operands, but operate on an
operand stack instead, which leads to the notion of a computation model
called a stack machine.
The program in listing 1 manipulates the stack as shown in figure 1:
the instruction iconst_3 pushes the integer 3 on top of the stack, iconst_5
pushes 5, and iadd takes the two topmost elements of the stack, adds them and
pushes the result back. The stack grows from the bottom to the top.
The operand stack consists of 32 bit wide stack slots. A single stack
slot can accommodate a value of the primitive types boolean, char, byte,
short, int, float, or an object reference. To accommodate a long or double
value, two stack slots are used.
Instructions are variable sized and consist of at least one byte, the opcode,
optionally followed by several bytes representing operands embedded in the
instruction itself. The getfield instruction, for example, is used to retrieve
the value of an object's field and contains a two-byte operand specifying the
field's index. The object reference is popped from the stack and the result,
the field's value, is pushed onto the stack.
Arithmetic instructions are typed, and special variants are defined for the
various primitive types (e.g. iadd adds two int values, whereas ladd adds two
long values).
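The stack behaviour described above can be reproduced with a few lines of code. This is a minimal sketch of a stack machine that understands only the iconst_0 to iconst_5 and iadd instructions. The function name `run` and the textual instruction format are illustrative assumptions and have nothing to do with CACAO's internals.

```python
def run(program):
    """Interpret a list of mnemonics and return the final operand stack."""
    stack = []
    for instruction in program:
        if instruction.startswith("iconst_"):
            # iconst_<n> pushes the small integer constant n.
            stack.append(int(instruction[len("iconst_"):]))
        elif instruction == "iadd":
            # Pop the two topmost ints, add them, push the result.
            right = stack.pop()
            left = stack.pop()
            result = (left + right) & 0xFFFFFFFF
            if result >= 0x80000000:
                result -= 0x100000000  # wrap around like a 32 bit int
            stack.append(result)
        else:
            raise ValueError("unsupported instruction: " + instruction)
    return stack
```

Running the three instructions from the text leaves a single value, the sum, on the stack.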
4 Register allocation
A naive compiler would generate machine code that maps the Java
operand stack to a stack located in memory. This is actually the approach
used by the Jikes RVM baseline compiler, and the approach kaffe's JIT used
to use, but it is suboptimal because memory accesses are expensive.
CACAO instead allocates the slots of the Java operand stack to
CPU registers, for example stack slot 2 to general purpose register 16.
If more stack slots are needed than registers are available,
stack slots are mapped to memory locations. On RISC platforms, they need
to be loaded into registers before use, and stored back afterwards.
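The mapping of stack slots to registers, with a fallback to memory, can be sketched as follows. This is a deliberately simplified illustration, not CACAO's allocator: the function name `allocate_slots`, the first-come-first-served policy and the 8-byte spill slots are all assumptions made for the example.

```python
SPILL_SLOT_SIZE = 8  # assumed size in bytes of one memory slot


def allocate_slots(num_slots, registers):
    """Map operand stack slots to registers; spill the rest to memory."""
    allocation = {}
    for slot in range(num_slots):
        if slot < len(registers):
            # Enough registers left: the slot lives in a CPU register.
            allocation[slot] = ("reg", registers[slot])
        else:
            # Out of registers: the slot lives at an offset in the frame.
            offset = (slot - len(registers)) * SPILL_SLOT_SIZE
            allocation[slot] = ("mem", offset)
    return allocation
```

With the registers 14, 15 and 16 available, stack slot 2 ends up in register 16 as in the example from the text, and a fourth slot is spilled to memory.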
and emit_load_s3. If the operand was allocated to a register, they
simply return the register number; otherwise, code is generated to load the
memory operand into a scratch register, and the number of the scratch
register is returned. The destination register of an operation is retrieved using
the function codegen_reg_of_dst, which may again return a scratch register
for memory destinations, and finally emit_store generates code to store the
result in case it belongs in memory. See listing 3 for an example showing the
implementation of the iadd bytecode instruction on POWERPC64.
7 Data segment
The generated code makes use of constant values: integer constants and address
constants (function entry addresses, addresses of static members). Some
architectures support immediate values of the native word size, so such values
can be embedded in the instruction stream, while other architectures have a
fairly limited range of immediate operands, so those values need to be placed
in memory. Because of this, the executable method's code has a block
of memory prepended, called the data segment (see figure 2), which holds those
constant values. On most architectures, one register, pv, is reserved to
hold the procedure vector, the current method's entry point. The values
in the data segment can then be loaded relative to the pv register with
negative offsets, or relative to the current program counter with negative
offsets.
The data segment of each method always contains a method header. This
is a data structure containing metadata about the method, such as a pointer to a
method descriptor, the stack frame size, the exception table, and the line number
table (see ?? for details).
9 Compiler invocation
Because just-in-time compilation of methods is expensive and counts towards
run time, CACAO tries to defer it, similarly to what it does for class loading. A
method is normally compiled the first time it is called. To achieve this, when
a class gets loaded, a so-called compiler stub is generated for each method.
A compiler stub is a small piece of code, usually a single trap instruction
combined with a pointer to the method's descriptor. Pointers to compiler
stubs are placed where method entry points would normally be placed: in
the class descriptor and in virtual function tables.
If such a compiler stub is invoked, the trap instruction causes control
to be passed to a signal handler, which extracts the method descriptor from
the stub and passes it to the compiler subsystem. The compiler generates
machine code for the method and returns the method's entry point. Then the
machine code before the call instruction is examined to determine the method
pointer: the address from which the pointer to the stub's entry was loaded.
This is a virtual function table entry, the data segment, or an immediate
operand in executable code. This location is then overwritten with the actual
method entry, so that further calls to the method are redirected to the newly
generated machine code.
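The idea of a stub that compiles a method on first use and then patches itself out of the call path can be modelled at a high level. The sketch below is only an analogy: Python closures stand in for trap instructions and machine code, and the names `Method`, `jit_compile` and `VTable` are hypothetical.

```python
class Method:
    """Stand-in for a method descriptor."""

    def __init__(self, name, body):
        self.name = name
        self.body = body          # a callable playing the role of bytecode
        self.compile_count = 0


def jit_compile(method):
    """Pretend to compile; return the 'machine code' entry point."""
    method.compile_count += 1
    return method.body


class VTable:
    """Virtual function table whose entries start out as compiler stubs."""

    def __init__(self, methods):
        self.entries = [self._make_stub(i, m) for i, m in enumerate(methods)]

    def _make_stub(self, index, method):
        def stub(*args):
            code = jit_compile(method)
            # Overwrite the table entry so later calls skip the stub.
            self.entries[index] = code
            return code(*args)
        return stub

    def invoke(self, index, *args):
        return self.entries[index](*args)
```

The first call through the table triggers compilation; every later call goes straight to the compiled entry, so the method is compiled exactly once.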
10 Exceptions
Exceptions are an integral part of the Java language and are used a lot.
Nonetheless, exceptions are rare events and occur irregularly.
Each method has an associated exception handler table. This table
describes the start and end instruction covered by each exception handler,
corresponding directly to the Java language try clause. When an exception
occurs at some point in the program, a lookup is performed in the exception
table. The type of the occurring exception is compared to the type of each
handler covering the throwing instruction.
If a match is found, the handler is executed; otherwise the exception is
propagated out of the method. To the caller this looks like a throwing
invoke instruction. As the caller of a method is unknown at compile time,
it has to be determined at run time. This is achieved by looking up the
return address, which is stored on the stack. Its offset is known, as CACAO
knows the stack usage of each method: stack space is allocated on
method entry and no dynamic allocation is performed.
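The table lookup and the propagation to the caller can be sketched like this. It is a simplified model, assuming half-open instruction ranges and using Python exception classes in place of Java types. The names `Handler`, `find_handler` and `dispatch` are illustrative and do not come from the CACAO source.

```python
from collections import namedtuple

# One row of an exception table: covered range, caught type, handler address.
Handler = namedtuple("Handler", "start end catch_type target")


def find_handler(table, pc, exc):
    """Return the handler address covering `pc` for `exc`, or None."""
    for handler in table:
        covers = handler.start <= pc < handler.end
        if covers and isinstance(exc, handler.catch_type):
            return handler.target
    return None


def dispatch(frames, exc):
    """Search frames from innermost to outermost.

    `frames` is a list of (table, pc) pairs, where pc is the throwing
    instruction in the innermost frame and the invoke instruction in
    every caller. Returns (frame_index, target) or None if unhandled.
    """
    for index, (table, pc) in enumerate(frames):
        target = find_handler(table, pc, exc)
        if target is not None:
            return index, target
    return None
```

A result of `None` corresponds to an unhandled exception; a non-zero frame index means the exception left the method in which it was thrown.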
An operation called "stack unwinding" is performed whenever an exception
is propagated to its caller. As control flow continues at the invoking
instruction, all callee-saved registers have to be restored for each stack
frame unwound. Callee-saved registers are stored on the method stack when a
method is entered, so the restore operation is implemented by loading
these registers from known stack locations.
This process terminates either when an appropriate handler has been
found, or when the whole stack has been unwound, in which case the exception
is unhandled and the program is aborted.
In CACAO, no explicit code is generated for calling back into the runtime
when an exception occurs; instead, an illegal memory operation is performed.
POSIX-compatible operating systems provide a signal handling mechanism
which invokes a function in this case. This signal handler tests whether the
memory operation was performed intentionally, and if so it calls the exception
handling code. If the memory access took place unintentionally, an internal
exception is thrown and the VM aborts.
Native functions that have been called may also have thrown an exception.
Natives cannot throw exceptions directly, but have to notify the
runtime by setting a flag in the environment. When they return, the
environment is checked for an exception, and exception handling code is
executed if needed. Exception handling is complex because natives may call back
into Java code. The stack layout is only known for JIT code; native code has
a different stack layout, and stack unwinding would fail when a native frame
is encountered. Therefore a chained data structure called stackframeinfo is
built up when invoking natives. Figure 4 illustrates this chaining. Technically,
there are no stackframeinfo structures for JIT frames, as their stack layout
is known and already contains all needed information.
11 Bytecode Verification
Because the Java virtual machine was designed to provide a sandbox
environment, it can’t just start executing untrusted bytecode. It would be
easy to construct malicious bytecode that, if executed, would crash the
virtual machine. Therefore all bytecode is subject to verification prior to
execution. Bytecode verification includes basic sanity checks of the class
file, type checking of bytecode instructions, checks for operand stack
underflow and enforcement of access protection as required by the Java language.
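One of the checks named above, the test for operand stack underflow, can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: it handles straight-line code only (no branches, no types), and the names `STACK_EFFECT`, `VerifyError` and `check_stack` are made up for the example. The stack effects listed are those of the corresponding JVM instructions.

```python
# (pops, pushes) for a handful of JVM instructions
STACK_EFFECT = {
    "iconst_0": (0, 1),
    "iconst_1": (0, 1),
    "dup": (1, 2),
    "pop": (1, 0),
    "iadd": (2, 1),
    "ireturn": (1, 0),
}


class VerifyError(Exception):
    """Raised when the simulated operand stack would under- or overflow."""


def check_stack(code, max_stack):
    """Simulate the operand stack depth of straight-line code.

    Returns the depth after the last instruction.
    """
    depth = 0
    for pc, op in enumerate(code):
        pops, pushes = STACK_EFFECT[op]
        if depth < pops:
            raise VerifyError(f"stack underflow at pc {pc}: {op}")
        depth += pushes - pops
        if depth > max_stack:
            raise VerifyError(f"stack overflow at pc {pc}: {op}")
    return depth
```

A real verifier additionally tracks the type of every stack slot and local variable and merges these states at branch targets; the depth check is only the simplest part of that analysis.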
• JSR, RET Another example is the jsr and ret instructions. Their
purpose is to implement the try/finally clause of the Java language.
The jsr instruction does not invoke any methods (despite its name); it
jumps to the finally block and stores the return address on the stack.
The ret instruction fetches the return address from a local variable,
an intentional asymmetry. The bytecode verifier has to treat return
addresses as an additional type to prevent hackers from returning to
an integer value they calculated.
These alone are not security problems per se, but they are subtle details
which have to be implemented 100% correctly to keep the sandbox tight.
13.1 POWERPC64
The POWERPC64 architecture is an enhancement of the POWERPC architecture
and offers a 64 bit address space and a 32 bit compatibility mode.
All instructions have a fixed 32 bit size, so immediate values are
necessarily smaller than 32 bits. As a consequence, loading a 64 bit address
takes more than one assembler instruction.
    lis    4, msg@highest     # load msg bits 48-63 into r4 bits 16-31
    ori    4, 4, msg@higher   # load msg bits 32-47 into r4 bits 0-15
    rldicr 4, 4, 32, 31       # rotate r4's low word into r4's high word
    oris   4, 4, msg@h        # load msg bits 16-31 into r4 bits 16-31
    ori    4, 4, msg@l        # load msg bits 0-15 into r4 bits 0-15
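The effect of this five-instruction sequence can be checked with a little arithmetic. The following Python sketch models each instruction on a 64 bit register value; the function name `load_address` is made up for the example, and the model covers only what these five instructions do to r4.

```python
MASK64 = (1 << 64) - 1


def load_address(addr):
    """Rebuild a 64 bit address from four 16 bit immediates, as the sequence does."""
    highest = (addr >> 48) & 0xFFFF
    higher = (addr >> 32) & 0xFFFF
    h = (addr >> 16) & 0xFFFF
    l = addr & 0xFFFF

    # lis: sign-extend the 16 bit immediate and shift it left by 16
    simm = highest - 0x10000 if highest & 0x8000 else highest
    r4 = (simm << 16) & MASK64
    # ori: OR the next 16 bits into the low halfword
    r4 |= higher
    # rldicr 4,4,32,31: rotate left by 32 and clear the low 32 bits,
    # which also discards any sign extension produced by lis
    r4 = ((r4 << 32) | (r4 >> 32)) & MASK64
    r4 &= 0xFFFFFFFF00000000
    # oris / ori: fill in the two low halfwords
    r4 |= h << 16
    r4 |= l
    return r4
```

Calling `load_address` with any 64 bit value returns the same value, including addresses whose top bit is set; in that case the sign extension of `lis` ends up in the part that `rldicr` clears.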
int addressOf(Object o) {
    // extract and return address from o.toString()
}

// Architecture dependent size of array header
// First array element is at this offset from array pointer
int arrayHeaderSize = 16;

// Shell code
byte[] code = { /* shell code, null bytes allowed */ };

// Virtual function table with 100 slots
// Each element (method entry) points to the shell code
int[] vftbl = new int[100];
for (int i = 0; i < vftbl.length; ++i)
    vftbl[i] = addressOf(code) + arrayHeaderSize;

// Object, first word points to virtual function table
int[] obj = new int[] { addressOf(vftbl) + arrayHeaderSize };

// Object pointer has to point to element 0 of obj
int objPtr = addressOf(obj) + arrayHeaderSize;
Hacking
lecture
2007-12-28 12:45
Saal 3
de
Stefan Strigler
BeF
Originally developed by Ericsson, Erlang was eventually released as open source in 1998. Although
Erlang has been around for almost ten years now, it became a rather popular programming
environment for communication platforms only recently. The talk will equip the open-minded
programmer with concepts of concurrent programming in a functional programming environment,
supported by real-world examples. Although actual code fragments will be on display,
there is no need for novices and non-programmers to be scared away.
A Conceptual Introduction to
Erlang
24C3
The aim of this talk is to give a brief look at Erlang/OTP, less in the form of "How do I
program something in Erlang?" and more as an answer to questions such as "What makes
Erlang special, and what can it do that other languages cannot do, or cannot do as well?".
The focus is on the use of Erlang in practice rather than on an introduction to working
with Erlang (sorry, no 'Hello World' today).
HISTORY
Erlang was created by the Computer Science Laboratory at Ellemtel (now Ericsson AB)
around 1990. It originates from an attempt to find the most suitable programming language for
telecom applications. Characteristics for such an application include:
• Robustness - An error in one part of the application must be caught and handled so that it
does not interrupt other parts of the applications. Preferably, there should be no errors at
all.
• Distribution - The system must be distributed over several computers, either due to the
inherent nature of the application, or for robustness or efficiency.
(Source: http://www.ericsson.com/technology/opensource/erlang/)
Erlang has been open source since 1998. The language was named after the Danish
mathematician Agner Krarup Erlang; the double meaning with Ericsson Language (ErLang)
is intentional.
PROCESS-ORIENTED PROGRAMMING
Joe Armstrong: "The world is parallel."
In Erlang the world consists of processes that exchange messages with one another. This
concept is very easy for us to grasp, because we act in a similar way: a traffic light
signals green, so we drive off. Or we ask directory enquiries for a telephone number and
are told it. Every person and every object that wants to interact in some way is simply
modelled as a process. One small extension over reality is that processes that terminate,
expectedly or unexpectedly, still disclose the cause; for example, when a traffic light
fails, its last words are 'light bulb burnt out'. If another process is interested in
this, the traffic light can be repaired accordingly.
In object-oriented development, data is modelled as objects and workflows as use cases
with method calls on objects. In current discussions this is unfortunately all too often
treated as a contrast, probably because classic object-oriented languages support
parallelization only by means of threads, whereas Erlang knows no classes and objects.
In principle, however, the two approaches do not contradict each other: processes can
also be understood as objects. In Python, method calls are called messages anyway and
have always been conceptually the same thing.
Threads share memory with one another, and access to it is guarded by locks to protect
against inconsistencies. If an error occurs while a lock is held, the program must
explicitly ensure that the lock is released again; otherwise execution would stall at
the next access to the lock.
Erlang, in contrast, has no shared memory and no global variables; instead, processes
communicate via messages.
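The traffic light example can be imitated in Python as a rough sketch of the message-passing idea. This is not how Erlang works internally: Python threads do share memory, and the names `traffic_light`, `run`, the message strings and the `("EXIT", name, reason)` tuple are assumptions chosen for the illustration. The worker only ever touches its mailbox, and on termination it reports the reason to a monitor.

```python
import queue
import threading


def traffic_light(inbox, monitor):
    """Process messages from inbox; always report the exit reason to monitor."""
    reason = "normal"
    try:
        while True:
            msg = inbox.get()          # blocks until a message arrives
            if msg == "stop":
                break
            if msg == "fail":
                raise RuntimeError("bulb burnt out")
            # any other message (e.g. "green") is simply processed
    except RuntimeError as err:
        reason = str(err)
    finally:
        monitor.put(("EXIT", "traffic_light", reason))


def run(messages):
    """Start the worker, send it the messages, and return its exit notice.

    The message list has to contain "stop" or "fail", otherwise the worker
    keeps waiting for mail.
    """
    inbox, monitor = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=traffic_light, args=(inbox, monitor))
    worker.start()
    for msg in messages:
        inbox.put(msg)
    worker.join()
    return monitor.get()
```

Calling `run(["green", "fail"])` yields an exit notice carrying the cause of the failure, which an interested party can act upon without ever having shared state with the worker.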
LANGUAGE FEATURES
• Erlang is a sequential(1) functional(2) programming language.
(1) sequential: a, b, c
(2) functional: f(e(d()))
and do not have to be declared beforehand. There are no global variables and no
memory shared by several processes.
• The nearly platform-independent runtime environment (footnote: runs on Linux, ...)
interprets byte code.
• Instead of threads there are processes, which are managed by the runtime environment
and are therefore very lightweight (footnote: in both RAM and start-up time).
• Here Pid is a process ID, which in a distributed system can also refer to another
Erlang node.
Parallelization
e.g. typical for client-server architectures and for keeping multi-core systems busy
A comparative example with many processes/threads follows, first in Erlang, then in Python:
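-module(processes).
-export([max/1]).

%% Spawn N processes and report the average spawn time per process
%% in microseconds (CPU runtime / wall clock).
max(N) ->
    Max = erlang:system_info(process_limit),
    io:format("Max. processes: ~p~n", [Max]),
    statistics(runtime), statistics(wall_clock),
    L = for(1, N, fun() -> spawn(fun wait/0) end),
    {_, Time1} = statistics(runtime),
    {_, Time2} = statistics(wall_clock),
    lists:foreach(fun(Pid) -> Pid ! die end, L),
    U1 = Time1 * 1000 / N,
    U2 = Time2 * 1000 / N,
    io:format("time for ~p processes: ~p/~p (runtime/real)~n",
              [N, U1, U2]).

%% Each spawned process blocks until it receives the message die.
wait() ->
    receive
        die -> void
    end.

%% for/3 is called above but was missing from the listing;
%% it calls F() once for each I in 1..N and collects the results.
for(N, N, F) -> [F()];
for(I, N, F) -> [F() | for(I + 1, N, F)].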
output:
1> processes:max(32000).
Max. processes: 32768
time for 32000 processes: 1.56250/3.71875 (runtime/real)
import os
import sys
from threading import Thread, Lock

gl = Lock()

class TestThread(Thread):
    def run(self):
        # Block until the main thread releases the lock, then exit.
        gl.acquire()
        gl.release()

t1 = sum(os.times())
N = int(sys.argv[1])
threads = []
gl.acquire()  # hold the lock so that all threads stay alive
for i in range(N):
    t = TestThread()
    t.start()
    threads.append(t)
gl.release()
for t in threads:
    t.join()
t2 = sum(os.times())
print("elapsed cpu time: " + str(t2 - t1) + "s")
KILLER APPLICATIONS
Ejabberd
• High-performance Jabber/XMPP server,
• clusterable,
• web administration,
• In-house benchmarks: one node on a dual Xeon 2.8 GHz with 8 GB RAM serves approx.
150,000 c2s connections.
• MXit South Africa runs an Ejabberd cluster with 4.8M registered users, 9M logins and
200M per day.
Tsung
• Benchmark tool for HTTP and XMPP
• Clusterable
Yaws
• High-performance web server for dynamically generated content
• embeddable
CRITICISM
• The usability of the documentation is not up to date; anyone who can handle man pages
will get along fine, though
• For questions, help and support there is (only?) a mailing list, which by now has
rather high traffic. However, people from the Ericsson development team as well as
Joe Armstrong himself also post there.
GETTING STARTED
• Download and documentation at http://www.erlang.org
LITERATURE
• Joe Armstrong, Robert Virding, Claes Wikström, Mike Williams: Concurrent
Programming in Erlang, Second Edition, Prentice Hall, 1996
• Joe Armstrong: Programming Erlang - Software for a Concurrent World, The
Pragmatic Programmers, 2007
• http://www.thinkingparallel.com/2007/03/20/ten-questions-with-joe-armstrong-about-parallel-programming-and-erlang/
Ten Questions with Joe Armstrong about Parallel Programming and Erlang
• http://armstrongonsoftware.blogspot.com/2006/08/concurrency-is-easy.html
Concurrency is easy
• http://armstrongonsoftware.blogspot.com/2006/09/why-i-dont-like-shared-memory.html
Why I don't like shared memory
• http://armstrongonsoftware.blogspot.com/2006/09/pure-and-simple-transaction-memories.html
Pure and simple transaction memories
• http://weblogs.mozillazine.org/roadmap/archives/2007/02/threads_suck.html
Threads suck
• http://en.wikipedia.org/wiki/Erlang_%28programming_language%29
Wikipedia: Erlang (programming language)
• http://de.wikipedia.org/wiki/Erlang_%28Programmiersprache%29
Wikipedia (de): Erlang (Programmiersprache)
• http://en.wikipedia.org/wiki/Declarative_programming
Wikipedia: Declarative programming
• http://en.wikipedia.org/wiki/Functional_programming
Wikipedia: Functional programming
• http://lambda-the-ultimate.org/node/2533
Generative Code Specialisation for High-Performance Monte Carlo Simulations
Science
lecture
2007-12-28 16:00
Saal 2
en
Linguistic Hacking
How to know what a text in an unknown language is about?
It is sometimes necessary to know what a text is about, even if it is written in a language
you don't know. This can be quite problematic if you do not even know in what language
it is written. This talk will show how it is possible to identify the language of a written
text and get at least some information about the contents, in order to decide whether a
specialist is needed to know more, and which one.
The talk deals with the following issues:
1 How to identify a language
• texts in non-Latin writing systems and how the writing system can show what language
we deal with,
• how to identify languages with the help of sample texts (a collection of sample texts
compiled for this purpose by Soviet linguists will be used),
• tricks that help to make at least an intelligent guess.
2 How to get an idea about the contents of a text
• identifying (important) content words and grammar,
• quick and dirty translations,
• how to translate a text from a language you hardly know.
The talk will introduce a variety of means, ranging from pre-internet (and
pre-computational) approaches to contemporary web resources.
Linguistic Hacking
How to know what a text in an unknown
language is about?
Martin.Haase@uni-bamberg.de
1 Introduction
In a first and rather brief outline, I will show how to identify the language of a written
text in traditional ways and with the help of computer technology. In the second part,
I will show how to get at least some information out of an unknown text. This is all
about linguistics, but what does it have to do with hacking? I will show that some tricks
must be used to solve such problems and define hacking in this context according to Eric
Raymond’s seventh definition as “the intellectual challenge of creatively overcoming or
circumventing limitations.” [10, 234]
I will confine my analysis to written texts (not necessarily in Roman script), although,
based on a multi-language corpus of telephone calls [7], considerable progress has been
made in the identification of spoken languages [8]. The main reason for this omission
is that with spoken language it is far more difficult (and perhaps even impossible) to
get clues about the contents of a conversation without at least some knowledge of the
language in question.
specific. A handbook on writing systems [4] or web resources [1] can easily help to
identify a script and thereby the language.
There are some difficult cases of course. One such case is the Hebrew script which is
used for:
• Judeo-Arabic,
• Yiddish
Of course, there are some simple tricks to distinguish between Hebrew and the other
languages. Normally, Hebrew is written without vowel diacritics (the little dots over
and under Hebrew letters). If your text shows no such signs, it is probably Hebrew.
If it contains such “vocalization signs”, it may still be Hebrew (a text from the Bible,
from a children’s book, or from learning material), but in that case the vocalization can
be consistently found throughout the text. If some words show (some) vocalization and
others don’t, it is most probably a Yiddish text, where Yiddish words contain a subset
of vocalization signs, but loan words from Hebrew are used without vocalization. Ladino
doesn’t contain super- or subscript diacritics at all. Moreover, Yiddish and Ladino texts
may contain Western (Arabic) numerals and Roman-script punctuation signs, but
sometimes even Hebrew texts contain Western numerals. Figure 1 shows a Yiddish text
(little vocalization, Western numerals, Western punctuation), whereas figure 2
shows the same text from the Hebrew Bible, i.e. the beginning of Genesis, the first
book of the Bible (Hebrew numbering, full vocalization, non-Western
punctuation).
The problem gets worse when we turn to the Arabic writing systems. Variants are
used for about twenty different and partly unrelated languages (and more subvarieties)
and Modern Arabic itself has about thirty commonly used varieties. In order to get an
idea about the language, it is helpful to work with sample texts [1, 6].
The Cyrillic writing system is even worse, since it is used for more than sixty
languages. Cyrillic writing systems for non-Slavic languages were conceived mainly in the
middle of the 20th century. When Cyrillic was adapted to different phonological systems,
additional letters were introduced that make it easy to identify a language, because every
writing system contains different special signs. That is why the identification of Cyrillic
languages is mainly done through the identification of character encoding.
1. frequencies of unique characters and character strings: this method, known from
cryptanalysis, classifies documents by the frequency of unique characters and the
occurrence of typical character strings; a nifty variant of this approach consists in
measuring the compression efficiency that a program such as gzip achieves when
appending an unknown document to various reference documents [3];
2. common words recognition: this method is based on word frequency lists (gener-
ated from sample texts), the unknown text is analyzed word by word and compared
to the list of the top 100 words (or so) of the sample texts;
3. n-gram analysis: this method works like common words recognition with the
difference that (instead of words) sequences of n characters are used (2-character
sequences, 3-character sequences, etc.): if we split the word text into 3-grams, this
would be the result: (_TE), (TEX), (EXT), (XT_), with _ denoting the word boundary.
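Splitting a word into n-grams is easy to express in code. The following Python sketch assumes that the word boundary is written as an underscore; the function name `char_ngrams` and the choice of boundary character are illustrative only.

```python
def char_ngrams(word, n=3, boundary="_"):
    """Return all n-character sequences of word, padded with a boundary marker."""
    padded = boundary + word + boundary
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]
```

For the word TEXT and n = 3 this yields the four 3-grams of the example above, two of which contain the boundary marker.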
These approaches all work according to the scheme in Figure 3: a document model is
generated from the input text in the unknown language and then this model is compared
to the existing models generated from sample texts.
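As a minimal sketch of this scheme, the following Python code builds a 3-gram model from a sample text per language and assigns an unknown text to the language whose model shares the largest fraction of its 3-grams. The similarity measure, the function names and the two tiny sample sentences are assumptions made for the illustration; real tools use much larger samples and more refined distance measures.

```python
from collections import Counter


def profile(text, n=3):
    """Count the character n-grams of all words, with _ as word boundary."""
    counts = Counter()
    for word in text.lower().split():
        padded = "_" + word + "_"
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts


def similarity(doc, model):
    """Fraction of the document's n-grams that also occur in the model."""
    total = sum(doc.values())
    if total == 0:
        return 0.0
    shared = sum(count for gram, count in doc.items() if gram in model)
    return shared / total


def identify(text, models):
    """Return the language whose model is most similar to the text."""
    doc = profile(text)
    return max(models, key=lambda lang: similarity(doc, models[lang]))


# Toy sample texts (hypothetical); real models need far more data.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and the cat",
    "german": "der schnelle braune fuchs springt über den faulen hund und die katze",
}
MODELS = {lang: profile(sample) for lang, sample in SAMPLES.items()}
```

With these models, `identify("the dog and the fox", MODELS)` selects the English model, because every 3-gram of the input occurs in the English sample and almost none in the German one.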
The advantages and shortcomings of this procedure can be critically evaluated [5]:
the main drawbacks are that only a closed class of languages can be identified (dialects
and varieties of these languages are usually ignored), and normally, multilingual text
cannot be processed. If the programs work for non-Roman scripts, they usually reduce
the recognition of non-Roman script languages to the detection of the encoding which
doesn’t work if a writing system is used for several languages and if non-standard or
mixed character encodings are used.
Here is a list of free software readily available (and running) on the internet [5, 12, 13]:
• LanguageGuesser (http://www.xrce.xerox.com/cgi-bin/mltt/LanguageGuesser)
provides for the web-based identification of about 40 languages, based on
statistical methods (frequency tests on characters and character sequences) [2],
• look for things you recognize without any help: numbers, dates, words from another
language; a number or a date can be a good hint; if it is a precise number or date,
a quick look-up with your preferred search engine might be helpful,
• look for typographic hints to important content: bold or italic print, colored or
underlined text chunks, capital letters (they may indicate names that you may
recognize or look up in Wikipedia).
Even with these steps you can get important hints about the contents of the text.
Moreover, the principle of least effort or Zipf’s law [14] can be very helpful to find
out what a text is about: Very frequent words are shorter and contain less lexical
information, whereas infrequent words are longer and contain more lexical information;
moreover, less lexical information implies more grammatical information and vice versa.
For our purpose, we are looking for words with more specific lexical information. So we
can ignore all short words, even if they recur throughout the text. A longer word
that is repeated is therefore more interesting. Here is an example (from Samoan,
which is difficult to identify as such, since it is not contained in typical language sample
collections):
Ua salalau lenei gagana i le lalolagi atoa. ’O lenei fo’i gagana, ’ua ’avea ma gagana lona lua a le
tele o tagata ’o le vasa Pasefika, e pei ’o Samoa. E iai le manatu, ’o le gagana fa’aperetania,
’ua matuā talitonu i ai le tele o tagata Samoa e fa’apea ’o le gagana e maua ai le atamai ma le
poto. ’E talitonu fo’i nisi o i latou, ’e lē aoga la latou gagana. E lē sa’o lea tāofi, ’auā e ’avatu le
gagana fa’aperetania i Samoa, ’ua leva ona atamamai ma popoto tagata Samoa e fai lo latou
soifua ma lo latou lalolagi.
The interesting words in this text are gagana and fa’aperetania, perhaps latou too,
although this is short enough to be a more grammatical item. It is difficult to find
a Samoan dictionary, but a quick search reveals that fa’aperetania means ‘English’
(8th Google result) and gagana ‘language’ (11th & 13th Google hit); latou is more
difficult to find and less useful, since it is a third person plural pronoun (as the French
Wiktionary reveals). So the text is about the English language, probably in Samoa
(“gagana fa’aperetania i Samoa”).
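The search for repeated longer words can be automated. The following Python sketch ranks every word that occurs at least twice by its number of occurrences multiplied by its length; this scoring rule, the function name `content_word_candidates` and the tokenisation are assumptions made for the illustration, not a method taken from the literature cited here.

```python
from collections import Counter

TEXT = (
    "Ua salalau lenei gagana i le lalolagi atoa. ’O lenei fo’i gagana, ’ua ’avea "
    "ma gagana lona lua a le tele o tagata ’o le vasa Pasefika, e pei ’o Samoa. "
    "E iai le manatu, ’o le gagana fa’aperetania, ’ua matuā talitonu i ai le tele "
    "o tagata Samoa e fa’apea ’o le gagana e maua ai le atamai ma le poto. "
    "’E talitonu fo’i nisi o i latou, ’e lē aoga la latou gagana. E lē sa’o lea "
    "tāofi, ’auā e ’avatu le gagana fa’aperetania i Samoa, ’ua leva ona atamamai "
    "ma popoto tagata Samoa e fai lo latou soifua ma lo latou lalolagi."
)


def content_word_candidates(text, min_count=2):
    """Rank repeated words by count * length, longest and most frequent first."""
    # Strip punctuation and apostrophes at the word edges only, so that
    # word-internal apostrophes (as in fa’aperetania) are kept.
    words = [w.strip(".,;:!?’'\"()").lower() for w in text.split()]
    counts = Counter(w for w in words if w)
    scores = {w: c * len(w) for w, c in counts.items() if c >= min_count}
    return sorted(scores, key=lambda w: (-scores[w], w))
```

Applied to the Samoan sample, the first two candidates are gagana and fa’aperetania, the same two words singled out in the discussion above; short grammatical words such as le are frequent but score lower because of their length.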
The example shows that it is rather simple to get at least minimal information out of a
text whose language is unknown to us, even if we don’t have direct access to a translator
or a dictionary.
References
[1] Omniglot. Writing Systems and Languages of the World. http://www.omniglot.
com/ (2007-11-16).
[2] K.R. Beesley. Language identifier: A computer program for automatic natural-
language identification of on-line text. Language at Crossroads: Proceedings of
the 29th Annual Conference of the American Translators Association, pages 12–16,
1988.
[3] D. Benedetto, E. Caglioti, and V. Loreto. Language Trees and Zipping. Physical
Review Letters, 88(4):48702, 2002.
[4] P.T. Daniels and W. Bright. The world’s writing systems. New York etc.: Oxford
University Press, 1996.
[6] N.C. Ingle. Language Identification Table. London: Technical Translation Interna-
tional, 1980.
[7] Y.K. Muthusamy, R.A. Cole, and B.T. Oshika. The OGI multi-language telephone
speech corpus. Proceedings of the International Conference on Spoken Language
Processing, pages 895–898, 1992.
[8] Y.K. Muthusamy and A.L. Spitz. Automatic language identification. Cambridge
Studies In Natural Language Processing Series, pages 273–276, 1997.
[10] E.S. Raymond. The New Hacker’s Dictionary. Cambridge, Mass.: MIT Press, 1996.
[14] G.K. Zipf. Human Behavior and the Principle of Least Effort: An Introduction to
Human Ecology. New York: Hafner, 1965.
Science
lecture
2007-12-28 18:30
Saal 3
en
Florian
In 2005, courtesy of its creators at Blizzard Entertainment, the ancient Blood God "Hakkar the
Soulflayer" unleashed a devastating plague, "corrupted blood", upon a totally unprepared
population of avatars. Unintentionally, the digital "black death" spread to cities and depopulated
whole areas. The epidemic could only be controlled by shutting down and restarting the game
world, a measure unfortunately not available in the "real" world. However, other measures such as
quarantine or improved treatment are available in the real world and can be simulated by disease
modelling. Disease modelling is essentially a virtualisation of reality that tries to gain insights into
hitherto unknown interdependencies and to simulate intervention scenarios. I will give a brief
overview of the use of infectious disease modelling in a population and explain the disease
dynamics of the "corrupted blood" epidemic in WoW. I will focus on cross references to the "real
I will begin with a brief introduction to modelling diseases, describe how I modelled the „corrupted blood“
plague of the online game World of Warcraft and finish with a few ideas on future virtual epidemics.
The challenge with a SIR model is to estimate the flow between different compartments, most notably
between S(usceptibles) and I(nfectious), which will be explained in more detail. For simplicity, birth rate and
natural death rate are ignored (closed population).
Assuming homogeneous mixing, the overall contact rate is c. Since we are only interested in contacts with
infectious individuals, we multiply by the proportion of infected I/N (where N=S+I+R = total population).
However, meeting an infectious individual does not always result in an infection event. This only happens
with a transmission probability p. For tuberculosis, for example, one would have to meet approximately 20
infectious people before contracting the disease, whereas measles or Ebola have a transmission probability
close to one. The term p*c is also called „beta“; multiplied by I/N it gives the „force of infection“.
So far, we have p*c*I/N which corresponds to the rate of transmission from infectious. The total
transmission rate in a population is the number of susceptibles S multiplied by that rate, finally yielding
p*c*I/N*S. N, p, c are constants, S and I are state variables and change with time, making the whole
system non-linear as mentioned above.
The "flow" from compartment I to R is governed by the inverse of the duration of infectiousness (D), usually
called delta. For example, if one remains infectious for 10 days (D=10) and time is counted in days, then
1/10 per day (1/D) of I flows to R. However, compartment I also loses individuals due to death at the
disease specific death rate sigma. Here, sigma is set to zero.
Summing up, compartment S loses individuals at a rate of p*c*I/N*S; compartment I gains individuals
at that rate but loses individuals at rate delta*I to compartment R. Compartment R gains individuals at rate
delta*I.
These rates are put into a system of differential equations which are solved numerically by computer
programs such as Berkeley Madonna (http://www.berkeleymadonna.com/).
In formula (dS/dt means change of S over time, no birth rate, no natural or disease specific death rate):
dS/dt = -p*c*I/N*S
dI/dt = p*c*I/N*S - delta*I
dR/dt = delta*I
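These three equations can also be integrated with a few lines of code. The following Python sketch uses the simple explicit Euler method with a fixed step size; the function name `simulate_sir`, the step size and the end time are assumptions chosen for illustration, and a dedicated solver will generally be more accurate.

```python
def simulate_sir(S, I, R, p, c, D, dt=0.01, t_end=100.0):
    """Integrate the SIR equations with explicit Euler steps.

    p: transmission probability, c: contact rate, D: duration of infectiousness.
    Returns a list of (time, S, I, R) tuples.
    """
    N = S + I + R
    beta, delta = p * c, 1.0 / D
    history = [(0.0, S, I, R)]
    steps = int(round(t_end / dt))
    for k in range(1, steps + 1):
        new_infections = beta * I / N * S * dt   # flow S -> I
        new_recoveries = delta * I * dt          # flow I -> R
        S = S - new_infections
        I = I + new_infections - new_recoveries
        R = R + new_recoveries
        history.append((k * dt, S, I, R))
    return history
```

Because every individual leaving one compartment enters another, S + I + R stays equal to N throughout the run, which is a useful sanity check for any implementation.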
The SIR model is suited for infections that generate immunity (R compartment). If immunity is lost with
time, one would use a SIRS model where the „waning immunity“ rate would determine the „flow“ from
compartment R to compartment S back again.
Most sexually transmitted infections such as syphilis, gonorrhoea or chlamydia, but also the „winter
vomiting disease“ caused by Norovirus, generate no or only partial immunity. S(usceptibles) become
I(nfectious) and, once the infection is cured, S(usceptible) again, resulting in a SIS model. Diseases such as
Hepatitis C or HIV (!condoms protect!) cannot be cured and leave people I(nfectious), yielding a SI model.
Corrupted Blood
Hakkar the Soulflayer
On September the 13th, Blizzard Entertainment released new gaming content for their acclaimed massively
multiplayer online roleplaying game, „World of Warcraft“ (WoW). For the sake of brevity, basic knowledge
about WoW is assumed.
A new map region called „Zul Gurub“ with a new challenging end-game opponent, „Hakkar the Soulflayer“,
was waiting for high level players. During battle, Hakkar cast a spell called „corrupted blood“ (CB) on a
random player that hit with severe damage once and additional smaller damage over time (DOT). DOT spells
are not uncommon in WoW; totally new, however, was the ability of the spell to get „transmitted“ to nearby
players and their „pets“ (fighting companions). The spell was infectious. The original intention of the game
designers might have been to force players to spread over an area and thus let the infection run out by
eliminating contact between players. What happened was that once infected players teleported back to
populated cities or hunters (special classes) summoned back their infected pets, CB spread like the famous
black death and depopulated whole areas. Worse still, non-player characters like in-game shopkeepers or
guards got infected as well. The game designers first tried to quarantine the disease but ultimately failed and
had to shut down the virtual world and reload it with a non-infectious version of CB. The CB incident caught
a lot of media attention and fuelled discussion on using online games as epidemic simulators.
Modelling CB
First, it has to be said that any epidemiological modeller could have predicted the devastating effects of CB.
The basic reproductive rate R0 was so absurdly high, that any natural pathogen would have killed its host
population and thereby sealed its own fate: no host, no pathogen.
Model parameters usually have to be estimated from observational data. To the great dismay of the
epidemiological community, no observational data on CB incidence is available from Blizzard. However,
with a programmed disease like CB, parameters are available directly. Duration of the disease, provided the
player survived, was 10 seconds. Low and mid level players died after two hits by the disease, that is after
4 seconds. Transmission probability was one, that is everyone in the vicinity of an infectious player got
infected as well. Not even Ebola is that contagious. Contact rate depended on geographic location. In special
WoW meeting places in cities like the auction house, a contact rate of 5 players per second is not uncommon.
Outside cities, the contact rate was lower.
It might seem confusing to think of dead players as recovered, but in terms of disease modelling, they cannot
be infected while on the graveyard and are thus, for the sake of CB, recovered.
The graphs in fig. 2 illustrate the course of the epidemic with different contact rates.
A: one infected at start, contact rate 2/s, resulting in 85% of players wasting their subscription fee on the
graveyard with a slightly diminished in-game experience.
B: 500 infected at start, contact rate 1/5s, epidemic dies out because of R0= D*c*p=4*1/5*1, which is <1. In
words, each infected creates less than one secondary infection.
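The threshold argument for runs A and B is plain arithmetic and can be written down as a small Python sketch. The function names are made up for the example; the parameter values are the ones quoted above (D = 4 seconds, p = 1, contact rates of 2 and 1/5 per second).

```python
def basic_reproduction_number(duration, contact_rate, transmission_prob):
    """R0 = D * c * p: secondary infections caused by one infected individual."""
    return duration * contact_rate * transmission_prob


def epidemic_can_spread(r0):
    """An epidemic can only take off if each case causes more than one new case."""
    return r0 > 1


# Runs A and B as described in the text
R0_A = basic_reproduction_number(4, 2.0, 1.0)
R0_B = basic_reproduction_number(4, 1 / 5, 1.0)
```

For run A every infected player causes 8 secondary infections, so the epidemic takes off even from a single case; for run B the value is 0.8, so even 500 initial cases cannot sustain transmission.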
[Figure 2, panels A and B: number of players (Susceptible, Infected, Graveyard) plotted against TIME]
Figure 2: SIRS dynamics depending on contact rate. Susceptible black, infectious thin dotted, recovered
thick dotted
The graphs in fig. 4 illustrate the course of the epidemic with different contact rates.
C: one infected at start, contact rate 2/s, resulting in 95% of players staying infectious.
D: 500 infected at start, contact rate 1/20s, epidemic dies out because of R0= D*c*p=10*1/20*1, which is <1
(D is 10 seconds and not 4 as in the SIRS cases A and B, as high level Avatars survive the full duration of
the spell).
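The 95% figure of run C follows from the endemic equilibrium of a SIS model. Assuming the SIS equations dI/dt = p*c*I/N*S - delta*I with S = N - I (an assumption, since they are not written out above), setting dI/dt to zero gives S/N = 1/R0 and therefore I/N = 1 - 1/R0. A short Python sketch, with a made-up function name:

```python
def sis_endemic_fraction(duration, contact_rate, transmission_prob):
    """Long-run infectious fraction I/N of a SIS model (0 if R0 <= 1)."""
    r0 = duration * contact_rate * transmission_prob
    if r0 <= 1:
        return 0.0          # the infection dies out
    return 1.0 - 1.0 / r0


fraction_C = sis_endemic_fraction(10, 2.0, 1.0)      # run C: R0 = 20
fraction_D = sis_endemic_fraction(10, 1 / 20, 1.0)   # run D: R0 = 0.5
```

Run C has R0 = 20, which gives an infectious fraction of 1 - 1/20 = 0.95, matching the 95% quoted above; run D has R0 = 0.5, so the infection cannot persist.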
[Figure 4, panels C and D: number of players (Susceptible, Infected) plotted against TIME]
Figure 4: SIS dynamics depending on contact rate; susceptibles black, infectious dotted
Society
lecture
2007-12-30 12:45
Saal 3
en
Olivier Cleynen
A brief reminder that although free software outperforms proprietary products in many respects, it
still remains a minor player in the market. We develop the most stable, trustworthy, usable software in
the world, and yet we fail to get past the 1% mark almost everywhere.
Perhaps most telling is the success of Microsoft Vista, whose supposedly poor performance we love to
describe. In the first month of sales, Microsoft sold 20 million units. That's more Vista sales in one
month than there have been GNU/Linux users in ten years.
So it's possible that we lack something to make a difference, and clearly it's not “good software”.
If we are to make a difference we have to solve or get around four problems.
1. Nobody chooses software
This fact is often forgotten because we are typically people who care so much about software that we
build our own. But in our society consumer life is getting so impossibly complicated (there is a
decision to make for just about any purchase, from potatoes to batteries) that by the time they come
home in the evening people don't want to worry about software. We have to be already “inside” when Joe
buys his computer.
2. We'll never have a killer app
Because of the nature of free software, ideas and code flow quickly, and we will typically never have a
killer application (they get ported too quickly). We continually forget this, however, and keep
trying to build one anyway (i.e. trying to make the perfect, ultimate, unique application).
3. The legal environment is hostile
This is summed up in one sentence: in most countries you cannot legally play MP3s and DVDs with free
software. The code is there, but the patent/DRM laws prevent using it legally. Until this
changes, free software will never make it to the shelves of any large-scale store.
4. The OS is disappearing
Because online services are typically well-designed, practical and sexy, we are losing hold of the
“real” operating system. There will always be software needed to run the PC chips, of course, but all
of the interesting software, with which we exchange ideas, produce work, and build our culture, is
progressively being transferred to private servers. Just ask how many people in a room full of
developers regularly use Google apps, and how many use proprietary-software devices to access some
kind of closed network (in their car, pockets, or living room).
Unless we shift our focus away from personal-computer-centric software, we are at risk of missing this
change in computing trends.
Making a real difference in the market means “tackling Joe”, the everyday user who has better things
to do than worry about the status of his software's code repository. Two points here:
1. Talk to Joe. The fact is our community is so focused on software stability and choice
that we shut ourselves off on an entirely different planet. Perhaps insisting more on usability,
absence of viruses, and simple, easy choices (i.e. killing Distrowatch) is the first thing to do.
2. Be relevant. Source code is the least of concerns for 95% of users out there. Speaking of “free
software” instead of “open-source” makes much more sense and does make a big difference
whenever Joe has to make a decision.
Getting back to basics, speaking a language that is relevant to Joe, is the sole focus of GNU/Linux
Matters, a non-profit which aims to explain Linux and free software to 1 million people in 2008.
The goal of this section is to introduce some “business-thinking” into software development. Because
our software is available at no cost, we fail to think in terms of market, customer expectation, or
segmentation.
On the proprietary side, knowing exactly what the consumers want and how much they are ready to
pay for it is a priority. The products then stem from this analysis (for example, the various Vista or
Photoshop versions).
In the free software world... we are often simply too busy forking to worry about what the users want.
This is because of The v0.12 Syndrome, whose symptoms are:
1. A total dedication to quality (“the bug tracker is the project”)
2. An agenda driven by the progression of the software instead of the opposite (i.e. “it's released
when it's ready”)
3. An overwhelming tendency to fork (whenever somebody disagrees on how the code is written)
The result: high quality, stable software that's perpetually in a v0.12 state, and ten miles of
altitude separating developers from users.
We'll start to break through when we realize that quality has never been a decision factor for the
end user. For example, OpenOffice.org is bloated but won over 100 million users (and is a major player
in opening standards) because of good market analysis: being just like MS Office was the requirement
there. Similarly, the only difference between Firefox and the low-profile Mozilla suite was some wise
market analysis: a few cuts and some branding, not better quality, have made all the difference.
Concluding remarks:
Making a lasting dent into the overwhelming domination of proprietary software in the market does
not require writing better code. What we lack is better market analysis: a more tactical perspective in
the development of our projects, and a focus on what the users want. Giving up quality to work on
differentiation, and adapting to the online world are two of the biggest requisites for that.
Talk given by Olivier Cleynen from GNU/Linux Matters, CC-BY-SA 2007. To learn more about us,
visit http://www.gnulinuxmatters.org/ .
Science
lecture
2007-12-27 12:45
Saal 3
en
Mark Vogelsberger
Ordinary matter made up of baryons contributes only 4% to the total content of the Universe. The talk
will present recent results with the main focus on computational methods and challenges in that
field. A state-of-the-art computer code for running these calculations will be presented in detail.
The talk will describe recent progress in the field of cosmic structure formation and will mainly
focus on computational problems and methods for carrying out such large simulations on the fastest
supercomputers available today. At the end of the talk I will also briefly discuss a new method
we developed to access the dark matter structure in the Milky Way on a scale that was simply
impossible some months ago with current supercomputers. To describe the evolution of the
Universe from the Big Bang to what we see today is quite a hard task. [...]
The following text is a very brief introduction to the field of cosmological supercomputer
simulations. Those who want to dig deeper into the field should consult the
references at the end.
1 The Universe
The goal of cosmological simulations is to model the growth of the structures in the
Universe. In other words, these simulations allow us to compress the long times of
cosmic evolution into a human lifetime and they can be considered as an experimental
tool to verify theories of the origin and the evolution of our Universe.
Today we believe that this evolution started with a Big Bang. Shortly after this
event small fluctuations were imprinted on the radiation and matter density field. To
understand how the Universe looks today, we need to know how these small perturbations
to an otherwise homogeneous and isotropic space evolve with time. This
calculation is highly complex and can only be done numerically using large computers.
Analytic methods can only be used in the linear regime; for the whole evolution
of the Universe numerical methods are needed. To run such cosmological simulations
one needs two main ingredients: first, initial conditions, which tell the computer where
it should start to calculate; second, the rules by which the computer calculates the
evolution of the Universe. The initial conditions for the simulation can be observed.
How can we do this? We get the initial conditions from the afterglow of the Big Bang.
About 300,000 years after the Big Bang the radiation could decouple from matter. This
radiation is still visible today. Due to the expansion of the Universe we observe it today
at a temperature of about 2.7 Kelvin. Modern satellite missions have resolved small
fluctuations in this radiation. From these fluctuations it is possible to infer the
perturbations in the initial density field of the matter. Thus we know what the initial
density field looked like 300,000 years after the Big Bang. This is the input of our
simulation. From this initial density field we have to evolve the Universe from the
starting point to today, about 13 billion years after the Big Bang.
The leading force for this evolution is gravity in an expanding space. Cosmological
codes use particles to trace the density field and evolve them under their mutual
gravity. As the simulation samples the smooth density field with a finite set of
particles, these computer simulations are called N-body codes. The more particles you
have, the better the resolution you get. This is why there is a constant competition for
the highest number of particles, and the computational resources needed to run
these calculations require the largest computers available today. I will focus here on the
simulation of gravity only. This is by far the most important process and also the
easiest thing to simulate. Note that there is also baryonic gas in the Universe; we are,
for example, made out of baryons. Everything you can see, like stars, galaxies, planets
and so on, is made of baryons. Their dynamics is also influenced by hydrodynamics
and complicated gas physics. This is a lot more complicated to deal with. Modern
simulation codes are also able to treat the baryons and compute a Universe with galaxies.
They allow stars to form and solve the gas physics. The cosmological code Gadget
(Springel, 2005), which was developed at our institute, is publicly available and can solve
both gravity and hydrodynamics. This is still quite restricted, because there are lots
of processes going on that need to be taken into account to get more realistic pictures:
black holes, cosmic rays, radiative transfer, magnetic fields and so on. The current
internal production version of the Gadget code has more than 200 options corresponding to
physical processes you can turn on or off. But the main evolution of cosmic structure
does not need gas physics. It can be calculated purely using the gravitational force in
an expanding Universe.
We can ignore the baryons for structure formation because they
make up only four percent of the total energy content of the Universe. The largest mass
component comes from what is known as Dark Matter. It is called dark because it does
not shine like stars or gas; it is invisible. Today we know that
about 23 percent of the Universe is made up of this Dark Matter. Dark Matter only
interacts by gravitation. This is why we can observe it only indirectly, through its
gravitational effect on visible objects like galaxies and gas. For example, Dark Matter can act
as a gravitational lens and deflect light from visible galaxies. Besides baryons
and Dark Matter, the largest component of the Universe consists of Dark Energy. In
Einstein's equations of general relativity this corresponds to the so-called cosmological
constant. Due to the small fraction of baryons in the Universe, most simulations of
structure formation only take into account the dark components, i.e. Dark Matter and
Dark Energy. Based on physical models and assumptions, galaxies, stars and gas can
be added in a post-processing step by so-called semi-analytic codes. These codes take the
output of the N-body simulations and use physical laws to infer the baryonic physics.
At the moment simulations are also starting to explore the gas physics more and more, because
the relevant codes are good enough and the available machines are fast enough to simulate
both gas and Dark Matter within one simulation.
Although we are very sure that there is Dark Energy and Dark Matter, we actually
do not know what these main components of the Universe are made of. Dark Energy
is very mysterious, and for Dark Matter we have some particle candidates that are
well motivated by particle physics. These are particles beyond the Standard
Model of particle physics, like supersymmetric particles.
The fact that many structure formation simulations only take into account the
dark components means that the simulation particles represent the Dark Matter density
field. Dark Matter behaves as a collisionless fluid and one needs to take some care to
model this correctly. Therefore the particles in the simulation are not treated as
point sources of a gravitational potential. The force is softened at small separations to
avoid what is called two-body relaxation. This is needed to preserve the collisionless
character of the Dark Matter fluid. One has to take into account one very important fact
when representing the Dark Matter density distribution by a discrete set of particles: these
particles are not real Dark Matter particles. Typical masses for some proposed Dark Matter
particles are in the range of 100 GeV. The masses of the particles in the simulation are in
the range of thousands of solar masses. It is totally impossible to simulate each Dark Matter
particle on its own. The particle distribution of the Dark Matter fluid is, so to speak, only
a Monte-Carlo representation.
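To make the idea of force softening concrete, here is a minimal Python sketch. It assumes the simple Plummer form of softening, in which r^2 is replaced by r^2 + eps^2; this form, the function names and the parameter eps are assumptions chosen for illustration, and production codes may use a different softening kernel.

```python
def newtonian_acceleration(r, mass, G=1.0):
    """Acceleration magnitude of a point mass; diverges as r -> 0."""
    return G * mass / (r * r)


def softened_acceleration(r, mass, eps, G=1.0):
    """Plummer-softened acceleration magnitude (assumed form, see lead-in).

    For r >> eps this approaches the Newtonian value; for r -> 0 it goes
    to zero instead of diverging, so close encounters between two
    simulation particles cannot produce large deflections.
    """
    return G * mass * r / (r * r + eps * eps) ** 1.5


eps = 0.01
for r in (0.0, 0.001, 0.01, 0.1, 1.0):
    soft = softened_acceleration(r, 1.0, eps)
    exact = newtonian_acceleration(r, 1.0) if r > 0 else float("inf")
    print(f"r = {r:5.3f}  softened = {soft:10.3f}  point mass = {exact:10.3f}")
```

With this form the softened acceleration never exceeds 2/(3*sqrt(3)) * G*m/eps^2, reached at r = eps/sqrt(2), which is what suppresses two-body scattering between the heavy simulation particles.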
After running the simulation, its output can be compared statistically to observations.
The important point is that both statistics show very good agreement. Such agreement
is strong support for the model of structure formation that we have
put into the computer simulation.
2 Some details
Gravity is the dominant force on large scales. At the beginning of the Universe there
were small density perturbations. These were magnified by gravity during the evolution
of the Universe. The main gravitational effect comes from Dark Matter; only on
smaller, galaxy-like scales does baryonic physics have to be taken into account. To simulate
the Dark Matter one has to solve the equations for gravity in an expanding Universe.
Normally the expansion is taken into account by a tricky time integration scheme, and
the coordinates in the simulation are so-called comoving coordinates. These are the
physical coordinates rescaled by the current size of the Universe. The main challenge
for the force calculation lies in the long-range $1/r^2$ character of the gravitational force.
The long-range character implies that every particle in the simulation feels every other
particle. This results in $N^2$ force interactions. The particle numbers required for
cosmological simulations are too high to solve this $N^2$ problem directly. Without
clever techniques to reduce the $N^2$ cost of these so-called Particle-Particle (PP) methods, it
is therefore impossible to run such a simulation. The PP method only works for quite
low numbers of particles. With special hardware it can also be used for higher numbers
of particles: so-called GRAPE chips are specially designed to calculate the gravitational
force at extreme speed. But this is still by far not enough
for cosmological structure formation applications.
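The cost of the Particle-Particle approach can be seen in a direct-summation sketch. The following Python function is illustrative only (names and the optional softening parameter eps are assumptions); it evaluates every ordered pair and therefore performs N(N-1) pair evaluations.

```python
import math


def direct_sum(positions, masses, eps=0.0, G=1.0):
    """Particle-Particle forces by brute force.

    Returns the acceleration of every particle and the number of pair
    evaluations, which is N * (N - 1) for N particles.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    pair_evaluations = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Vector from particle i to particle j.
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(c * c for c in d) + eps * eps
            f = G * masses[j] / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += f * d[k]
            pair_evaluations += 1
    return acc, pair_evaluations


acc, count = direct_sum([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [1.0, 1.0])
print(acc, count)

# The pair count alone shows why this does not scale to large N.
for n in (10**3, 10**6, 10**9):
    print(f"N = {n:.0e}: {n * (n - 1):.2e} pair evaluations per force computation")
```

Doubling the particle number roughly quadruples the work, which is the scaling that the Tree and mesh methods described next are designed to avoid.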
A very common method to solve this problem is the Tree method. The idea is that
the force of a distant group of particles can be approximated by the force of a single
particle carrying the group's total mass at its center of mass. This approximation reduces
the scaling of the number of calculations from $N^2$ to a much better $N \log N$. The
question is how to arrange the particles into groups in an efficient way. A good way is a
hierarchical tree of cells: the simulation volume is divided into smaller cubes, each with
1/8 of the parent's volume, at every stage until the smallest cells have only one particle
in them. The question for the force calculation is then whether to open a cell or whether
it is fine to take the whole group for the force calculation. Cells that are far away from
the point of force evaluation do not have to be opened. Nearby groups need to be opened.
Whether to open a cell or not is decided by a so-called acceptance criterion. This criterion
in the end determines the force accuracy you get.
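The tree walk described above can be sketched in a few dozen lines of Python. This is a simplified illustration and not the Gadget implementation: the acceptance criterion used here is the classic geometric one (accept a cell if its size is smaller than theta times its distance), only the monopole (center of mass) term is used, a small Plummer softening eps is assumed, and all names are hypothetical. Particles are assumed to have distinct positions inside the unit cube.

```python
import math


class Cell:
    """Cubic cell of the tree; a leaf holds at most one particle."""

    def __init__(self, center, half):
        self.center, self.half = center, half
        self.mass = 0.0
        self.com = [0.0, 0.0, 0.0]    # center of mass of everything inside
        self.particle = None          # (position, mass) for an occupied leaf
        self.children = None          # eight sub-cells once the cell is split

    def insert(self, pos, mass):
        if self.mass > 0.0:
            if self.children is None:             # occupied leaf: split it
                self._split()
                old_pos, old_mass = self.particle
                self.particle = None
                self._child(old_pos).insert(old_pos, old_mass)
            self._child(pos).insert(pos, mass)
        else:
            self.particle = (pos, mass)           # empty leaf: store particle
        total = self.mass + mass
        self.com = [(self.com[i] * self.mass + pos[i] * mass) / total for i in range(3)]
        self.mass = total

    def _split(self):
        h = self.half / 2.0
        self.children = [
            Cell([self.center[k] + (h if (i >> k) & 1 else -h) for k in range(3)], h)
            for i in range(8)
        ]

    def _child(self, pos):
        index = sum(1 << k for k in range(3) if pos[k] >= self.center[k])
        return self.children[index]


def build_tree(positions, masses):
    root = Cell([0.5, 0.5, 0.5], 0.5)             # unit cube assumed
    for pos, mass in zip(positions, masses):
        root.insert(pos, mass)
    return root


def acceleration(cell, pos, theta=0.5, eps=1e-3, G=1.0):
    """Walk the tree and sum softened monopole forces on the point pos."""
    if cell.mass == 0.0:
        return [0.0, 0.0, 0.0]
    d = [cell.com[k] - pos[k] for k in range(3)]
    r = math.sqrt(sum(c * c for c in d))
    # Acceptance criterion: leaves are always used directly, other cells
    # only if they look small from the evaluation point.
    if cell.children is None or 2.0 * cell.half < theta * r:
        f = G * cell.mass / (r * r + eps * eps) ** 1.5
        return [f * c for c in d]
    total = [0.0, 0.0, 0.0]
    for child in cell.children:                   # open the cell
        a = acceleration(child, pos, theta, eps, G)
        total = [total[k] + a[k] for k in range(3)]
    return total
```

A short usage example: `root = build_tree(points, masses)` followed by `acceleration(root, points[0])`. With theta = 0 every cell is opened and the result equals the softened direct sum; larger values of theta trade accuracy for speed.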
Another very popular class of methods to calculate the gravitational forces are the so-called
Particle-Mesh (PM) methods. In fact they were the first methods used to run larger
cosmological simulations. These methods use the fact that the Poisson equation relevant
for the gravitational forces is a simple algebraic equation in Fourier space. With
a Fast Fourier Transform (FFT) the forces can be calculated very fast. The FFT
requires functions sampled at uniformly spaced points. A grid/mesh is used for this.
In the simulation particles are used for representing the density and velocity field. This
means that the density field at the mesh points has to be interpolated from the particles.
The fact that both particles and meshes are used in the simulation gives this technique its
name. The Fourier method has some advantages: it automatically implies periodic boundary
conditions, softens the forces at small scales because of the mesh resolution, and the FFT
can easily be parallelised. These points are very important for cosmological simulations.
But PM methods also have a very critical disadvantage: the softening on mesh
scales is acceptable in itself, because softening is needed to simulate the collisionless
Dark Matter fluid, but it also means that the PM code cannot resolve scales below the mesh
scale. This is a very serious limitation of the dynamic range of PM simulations. An extension
of classical PM methods are the so-called Adaptive Mesh Refinement (AMR) codes.
In these methods the grid is refined in higher density regions. This way the resolution
is increased where it is needed.
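The statement that the Poisson equation becomes algebraic in Fourier space can be illustrated in one dimension. For a periodic box, the equation for the potential phi with density rho turns into phi_k = -4*pi*G*rho_k / k^2 for every mode k != 0. The sketch below is an assumption-laden toy: it is one-dimensional, it uses a naive discrete Fourier transform written in pure Python instead of a real FFT so that it stays self-contained, and it skips the mass assignment and force interpolation steps a real PM code needs.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = [
        sum(v * cmath.exp(sign * 2j * math.pi * j * m / n) for j, v in enumerate(values))
        for m in range(n)
    ]
    return [x / n for x in out] if inverse else out


def solve_poisson_periodic(density, box_size, G=1.0):
    """Solve d^2 phi / dx^2 = 4 pi G rho on a periodic 1D mesh."""
    n = len(density)
    rho_k = dft(density)
    phi_k = [0j] * n                      # k = 0 mode (mean density) is dropped
    for m in range(1, n):
        mode = m if m <= n // 2 else m - n    # signed mode number
        k = 2.0 * math.pi * mode / box_size
        phi_k[m] = -4.0 * math.pi * G * rho_k[m] / (k * k)
    return [x.real for x in dft(phi_k, inverse=True)]


n, box = 32, 1.0
density = [math.cos(2.0 * math.pi * j / n) for j in range(n)]
potential = solve_poisson_periodic(density, box)
print(potential[0], -4.0 * math.pi / (2.0 * math.pi / box) ** 2)
```

The periodic boundary conditions mentioned in the text come for free here, because the discrete Fourier transform treats the mesh as periodic.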
Figure 1: Dark Matter density field. This is a slice through the Millennium Simulation
(see references). One can clearly see that the Dark Matter shows a filament-like structure.
There are also very dense and underdense regions. The underdense regions
correspond to very large voids in the Universe.
Another possibility to get rid of the low resolution on mesh scales is to combine
the mesh method with a particle-based method. This means that the “bad” forces of
the mesh on small scales are corrected by a direct summation of the particle forces
for close neighbors. These methods are called PP + PM = $P^3M$ methods (Particle-Particle
plus Particle-Mesh). The direct summation of the PP part can also be replaced
by a Tree-based method. These codes are then called hybrid codes. A very efficient
hybrid method is the TreePM method. It splits the force into a short-range and a
long-range part. The short-range force is calculated with a Tree, whereas the long-range part
uses the PM method.
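One common way to realise such a force split is sketched below. It assumes a Gaussian split with a scale r_s, where the short-range part of the Newtonian pair force is multiplied by erfc(r/(2 r_s)) + r/(r_s sqrt(pi)) exp(-r^2/(4 r_s^2)) and the long-range part is the remainder. The text above does not specify the split, so this particular form, the scale r_s and the function names are assumptions made for illustration.

```python
import math


def short_range_factor(r, r_s):
    """Fraction of the Newtonian pair force assigned to the Tree part.

    Assumed Gaussian split with scale r_s: close to 1 for r << r_s and
    falling off rapidly for r of several r_s.
    """
    u = r / (2.0 * r_s)
    return math.erfc(u) + (r / (r_s * math.sqrt(math.pi))) * math.exp(-u * u)


def split_force(r, r_s, mass=1.0, G=1.0):
    """Return (short-range, long-range) parts of the pair force."""
    newton = G * mass / (r * r)
    short = newton * short_range_factor(r, r_s)
    return short, newton - short


r_s = 1.0
for r in (0.01, 0.5, 1.0, 2.0, 5.0, 10.0):
    short, long_part = split_force(r, r_s)
    print(f"r = {r:5.2f}  short fraction = {short_range_factor(r, r_s):.3e}")
```

Because the short-range factor is negligible beyond a few r_s, the Tree walk only has to visit nearby cells, while the smooth remainder is well suited to the mesh.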
The algorithm for the force calculation is only one problem in simulations. Another
important issue is the so-called domain decomposition strategy, which divides the
work between many processors. Cosmological simulations are often run on a number
of processors of the order of 1000. The goal is to reach optimal load and memory
balance. There are different schemes around. The cosmological code Gadget uses a
fractal space-filling Peano-Hilbert curve as decomposition scheme.
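The idea behind a space-filling-curve decomposition can be shown with a two-dimensional Hilbert curve. The sketch below is a simplification under stated assumptions: it works on a 2D grid of cells, whereas a cosmological code needs the 3D analogue, and the function names and the equal-count splitting rule are illustrative choices, not the scheme of any particular code.

```python
def hilbert_key(order, x, y):
    """Position of grid cell (x, y) along a 2D Hilbert curve.

    The grid has 2**order cells per side. Cells that are close along the
    curve are also close in space, which keeps each domain compact.
    """
    n = 1 << order
    key = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        key += s * s * ((3 * rx) ^ ry)
        if ry == 0:                       # rotate/flip the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return key


def decompose(cells, order, n_domains):
    """Sort cells along the curve and cut the curve into equal pieces."""
    ordered = sorted(cells, key=lambda c: hilbert_key(order, c[0], c[1]))
    chunk = -(-len(ordered) // n_domains)     # ceiling division
    return [ordered[i:i + chunk] for i in range(0, len(ordered), chunk)]


order = 3
cells = [(x, y) for x in range(1 << order) for y in range(1 << order)]
domains = decompose(cells, order, 4)
for rank, domain in enumerate(domains):
    xs = [c[0] for c in domain]
    ys = [c[1] for c in domain]
    print(f"processor {rank}: x in [{min(xs)}, {max(xs)}], y in [{min(ys)}, {max(ys)}]")
```

In a real simulation the cut points along the curve would be chosen so that each processor gets a similar amount of work and memory rather than an equal number of cells.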
Once all the forces are calculated the simulation can be advanced one time step.
The time integration algorithm that is mostly used is a quasi-symplectic leapfrog.
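A leapfrog step in its kick-drift-kick form looks as follows. This is a minimal one-dimensional sketch with a fixed time step and without the cosmological expansion factors that a real code includes in the kick and drift operations; the harmonic test force and all names are assumptions for illustration.

```python
def leapfrog_step(x, v, accel, dt):
    """One kick-drift-kick leapfrog step for position x and velocity v."""
    v_half = v + 0.5 * dt * accel(x)          # half kick
    x_new = x + dt * v_half                   # full drift
    v_new = v_half + 0.5 * dt * accel(x_new)  # half kick with the new force
    return x_new, v_new


def oscillator_energy(x, v):
    """Energy of the test problem: unit-mass harmonic oscillator."""
    return 0.5 * v * v + 0.5 * x * x


x, v = 1.0, 0.0
e0 = oscillator_energy(x, v)
for _ in range(10_000):
    x, v = leapfrog_step(x, v, lambda q: -q, 0.01)
energy_error = abs(oscillator_energy(x, v) - e0)
print(f"energy error after 10000 steps: {energy_error:.2e}")
```

The energy error stays bounded over many steps instead of growing steadily, which is the property that makes symplectic schemes attractive for long integrations.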
Cosmological simulations have to face lots of other technical issues, for example
I/O: because the typical snapshot size is extremely large, the data needs to be
stored in parallel.
[...] on a 512 CPU cluster. After finishing the simulation lots of scientists started to analyze
it, and they are still doing so today. The amount of data is very large and the simulation
gives us a perfect tool to test our models and see whether they are correct or not. The
simulation was done with the Gadget code. Fig. 1 shows one output of the simulation.
It is the Dark Matter density field of a slice through the simulation box.
4 Further reading
1. How to simulate the Universe in a Computer (Alexander Knebe)
http://arxiv.org/abs/astro-ph/0412565
2. Cosmological N-Body Simulations (J.S. Bagla, T. Padmanabhan)
http://arxiv.org/abs/astro-ph/0411730
3. Cosmological N-Body simulation: Techniques, Scope and Status (J.S. Bagla)
http://arxiv.org/abs/astro-ph/0411043
4. Millennium Simulation (Springel et al)
http://www.mpa-garching.mpg.de/galform/press/
Hacking
lecture
Tag 2 17:15
Saal 2
en
Jens Kubieziel
To be or I2P
An introduction into anonymous communication with I2P
I2P is a message-based anonymizing network. It builds a virtual network between the
communication endpoints. This talk will introduce the technical details of I2P and show
some exemplary applications.
I2P takes a different approach than most other known anonymizing applications. Maybe you know
the anonymisation network Tor. There you have central directory servers, onion routers
(relaying traffic), onion proxies (sending and receiving data for the user) and other software roles
within the network. I2P calls every piece of software a router, and it can send and receive data for
the user as well as relay traffic for other users. Furthermore I2P uses no central server for
distributing information about routers. You get this information from I2P's network database,
a pair of algorithms which share the network metadata. The routers participate in the
Kademlia algorithm, which is derived from distributed hash tables. My talk will tell you in detail
how I2P works and what roles routers, gateways, netDb etc. play. Furthermore I'll show differences
from and similarities to other anonymizing networks, e.g. Tor, and introduce some exemplary
applications.
To be or I2P
Jens Kubieziel <jens@kubieziel.de>
2007-12-27
Abstract Many of you may know about Tor or JonDo. These are widely deployed
anonymising systems. Another promising approach is I2P. This paper will show the
basic concepts of this network and introduce some applications.
HTTP proxy to localhost with port 4444 and enter “normal” domain names.

4.2 Email
For email there is a web interface, or you can use your mail client. An email address in I2P
has the form username@mail.i2p. The username can be freely chosen. Just go to the Postman HQ [6]
and create a new mailbox. This site also has instructions on how to set up your mail client.
Once you are ready, you can send emails. Another way to send your emails is to use the web
interface called Susimail. Just log on with your username and password.
You can also use I2P to communicate with the outside world. I2P mail can connect to an internet
mail server [7], where it rewrites your email address to username@i2pmail.org. The receiver can
answer it. The mail server will restore the domain name to mail.i2p and forward it to your
mailbox.

4.3 Blogging
Syndie is a censorship-resistant, anonymous blogging tool. You can write postings which are then
published on your local PC and on distributed archives. The software is not part of the I2P
distribution. It can be downloaded from http://syndie.i2p/ and, like I2P, is written in Java.
After installation is finished, the software has to be configured. If you only want to read
other postings, you can subscribe to the forum. In case you also want to publish blog postings,
more work must be done. First choose a nickname, then choose how Syndie connects to archive
servers, and in the end add any desired forums. Syndie contains a button labelled Post. Click on
it and write your postings.

4.4 Chat
The main chat protocol is IRC. Point your chat client to localhost with port 6668 and choose a
channel.

4.5 File sharing
There are several clients for several networks. I2PSnark is bundled with I2P and offers you
access to BitTorrent. Furthermore the developers of Azureus have written azneti2p, which is also
a BitTorrent client. I2Phex is a port of the Phex Gnutella client and, lastly, iMule allows
access to eMule.

[6] http://hq.postman.i2p/
[7] mx.i2pmail.org
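The two configuration details mentioned above, the local HTTP proxy on port 4444 and the address rewriting done by the mail gateway, can be sketched in Python. The function names are hypothetical, the rewriting functions only mimic the mapping described in the text (they are not the gateway's actual code), and the opener is merely constructed here; no request is sent.

```python
import urllib.request

I2P_HTTP_PROXY = "http://localhost:4444"   # local I2P HTTP proxy from the text


def make_i2p_opener(proxy_url=I2P_HTTP_PROXY):
    """Build a urllib opener that sends HTTP requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def to_outside_address(i2p_address):
    """username@mail.i2p -> username@i2pmail.org (outgoing mail)."""
    user, _, domain = i2p_address.partition("@")
    if not user or domain != "mail.i2p":
        raise ValueError(f"not an I2P mail address: {i2p_address!r}")
    return f"{user}@i2pmail.org"


def to_i2p_address(outside_address):
    """username@i2pmail.org -> username@mail.i2p (incoming replies)."""
    user, _, domain = outside_address.partition("@")
    if not user or domain != "i2pmail.org":
        raise ValueError(f"not a gateway address: {outside_address!r}")
    return f"{user}@mail.i2p"


opener = make_i2p_opener()
print(to_outside_address("alice@mail.i2p"))
print(to_i2p_address("alice@i2pmail.org"))
```

With a running I2P router, `opener.open(...)` on a `.i2p` URL would route the request through the proxy; the address `alice` is of course only a placeholder.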
Culture
lecture
Tag 1 23:00
Saal 3
en
SkyOut
VX
The Virus Underground
The listeners will be introduced to the world of virus coding. They will understand how this
can be seen as a way of expressing yourself and why it is a form of hacking. Furthermore
they will get to know which important groups, authors and viruses have existed in
recent years and which are still active today. Important technical terms will be explained,
as well as trends of recent years and the future.
The aim of the lecture is to introduce the audience to the world of the virus underground. They
shall understand how this little community of about fifty people thinks and acts and why they code
viruses. The audience may come to understand the coding of viruses as a type of hacking and a way of
expressing oneself as art. A further aim is to make them familiar with terms that
are typically used by virus coders (VX), for example appender, prepender and overwriter viruses.
In addition, different aspects of multiplatform malware and payloads shall be explained. Then the
audience shall be introduced to different authors and groups of the scene that are, in a way, the
idols of many VXers: groups like EOF, DoomRiderz and more, and people like Roy G Biv, Virusbuster,
Benny and more. Going on, the lecture will describe the relationship between VXers and the
antivirus companies; even if it does not seem so, there is a connection between both groups. [...]
VX - The Virus Underground
For beginners
Marcell Dietl
hakin9 Nr. [...], www.hakin9.org/de

Again and again one reads in the media about new viruses and worms that get into circulation and
cause great damage. Yet hardly anyone knows that there is a small group of people who program
so-called malware because they regard it as an art of hacking. It is to this group of people and
their ideology that we turn in what follows.

In this article you will learn how the virus scene is structured; which technical terms virus
writers use; which important people, groups etc. exist in the scene; what connection exists
between virus writers and anti-virus companies; and much more.
What you should know beforehand: apart from an interest in the subject, no prior knowledge is
needed.

When the headlines in magazines and on television are once again dominated by a new virus or worm,
cold shivers run down many people's spines; they fear a new wave of infections and develop anger
towards the authors of such programs, which we will classify as malware in what follows. Hardly
anyone considers, however, that such programs can also arise from positive motives. Unfortunately
this is often not mentioned, since only malware that mostly arises from greed for profit fills the
headlines. The small scene of perhaps fifty people who have made it their goal to keep creating new
and creative programs suffers from this negative generalisation and is by now being fought in many
countries just like the criminals themselves. In the following article, however, we want to turn to
exactly this group of people, learn to understand their way of thinking, and recognise that viruses
can indeed be an art of hacking, albeit a completely different one than many think. So let us take a
look at this small group, its idols, its ideology and the structure of the scene, and perhaps you
will change your opinion a little the next time you read about a new virus or worm that once again
arose purely from greed for money.

Important groups of the scene
Later we will look more closely at the problems of the VX scene, but let it be mentioned here
already that one of the problems is the constant change of the groups and of the constellation of
the scene. There are constantly new virus authors and virus groups, but only a few remain active in
the long run, which makes it quite hard to pin down exactly which groups and authors currently exist
and to what extent they are still active. Nevertheless I would like to familiarise you here with
some names of groups that, at least as things stand now, can be regarded as active. Whether they
will remain so in the future is questionable. Let us start with the two most important groups. One
is called Ready Rangers Liberation Front (rRlf) and is a German group that has existed for seven
years now. They have produced many releases in recent years and it is unlikely that they will
become inactive in the near future. The other, and probably best-known, group of the whole scene is
29A, which is the hex code of 666. It is an international group that was likewise founded many
years ago, at the beginning of VX (virus coding). It put out many releases and is still active
today, although its structure has changed slightly. To give you a good basis, let me also name the
groups DoomRiderz (from America, where writing viruses is by now considered a crime), Purgatory
(from Iran), F[...] Labs and the EOF-Project.net, which I founded in [...] and which has been under
new leadership since [...].
Of course I can only give you a rough overview here, since the structure of the scene is constantly
changing, but at least some terms are now already familiar to you. All groups mostly consist of
members from different countries; for example the leader (head/organiser) of the American team is
himself not from America. That group was originally founded in America, but it has changed
considerably since. The EOF-Project, too, consists of members from the most diverse countries of
this world, a typical peculiarity of the scene, since it is mostly not organised along national
lines. I hope this section has already given you a small insight into the most important groups of
the scene.

The ideology of the authors
In the last section you received a brief introduction to some important groups of the scene; now we
want to turn to the ideology of the authors, because not all of them program viruses out of the same
principles, and there are enormous differences. Of course one could first name those who create
malware only to earn money with it, be it because they resell something like VCKs (Virus Creation
Kits) or because they infect computers in order to spy out sensitive data. But we do not want to
devote any further attention to them, because a VXer has nothing to do with these criminals.
Otherwise one could divide the virus writers into two large groups, the hobbyists and the
ideologues. By common definition, hobbyists are programmers who occasionally write a virus, or do so
only once or a few times, seeing it as a creative challenge. This is also the reason for the
constant restructuring of the scene, since there are constantly new authors who, however, mostly
disappear again after a few viruses. Hobbyists see programming a virus as an interesting challenge,
but quickly turn to other topics again. Then there are the ideologues. They program viruses and
other malware because for them it represents an artistic challenge. They constantly want to learn
and implement new techniques and look for new possibilities, such as new ways of spreading. It is
hard to say who belongs in which group, and the boundary is usually quite fluid, but it can be
determined fairly well in terms of time: whoever has been in the virus scene for a long time and has
always remained active belongs to the ideologues rather than the hobbyists. As an example, the
author Roy G Biv of 29A may be named here, who today is one of the longest-serving and most active
members of the scene. He clearly belongs to the ideologues. He is not the only one, but a very good
example. Finally there is one more decisive distinction, namely between VXers who release their
viruses into the wild and those who do publish viruses, but not in order to cause destructive
damage. If one were to compare it with hackers, one could speak of whitehats for most VXers, since
they program everything with good intentions, and of blackhats for those who want to cause damage.
But this only as a small parallel to the world of hacking, which VX, to a certain extent, also is.

Figure: The logo of the group Ready Rangers Liberation Front
Figure: The logo of the DoomRiderz group
[…] writing of programs in scripting languages. Well-known examples are PHP, Perl, Python or Ruby. Code written in such languages is platform-independent insofar as it is executed by a program, the interpreter, on the respective operating system. A program that, for example, contains a function to delete the current folder will run on any system that has the matching interpreter installed. If you are interested in this kind of thing, take a look at the very simple example virus Cyanotic, which I make available on my website www.smash-the-stack.net. It is written in Java and does nothing more than overwrite the files of the folder it resides in with a string (a sequence of several characters). Any system that has Java can also execute this virus; no adaptations are needed.

A final, though more rarely used, option is to write a virus in a low-level programming language such as assembler and to design the code so that it behaves according to the respective operating system. The effort is accordingly enormous. Should you nevertheless find this very interesting and want further information that would go beyond the scope of this article and section, I recommend searching for what is probably the best-known example code, named Winux, an allusion to the combination of the words Windows and Linux (note: the virus can be found under various names: Linux.PEElf, W32.Winux, Linux/Winux, W32/Lindose).

Propagation channels in use

The previous sections have already given you a small insight into the VX scene and its development over the last years. What we have not dealt with so far is how typical viruses are built and which technical terms or techniques are frequently used. We will turn to that now. Let us take a look at the typical propagation channels VXers use for their viruses. Of course there are more possibilities than the ones I will present here, but it is at least a small insight into typical worm techniques.

As we learned earlier, the first viruses were written as small projects, mainly by students, and were mostly spread via friends and floppy disks. People passed the file on to one another, and so the first worms came into being, although they were not yet worms in the strict sense. Today other techniques are used. One frequently used option is to make use of the autostart function of a CD or DVD-ROM. Only a few Windows users have disabled it, so it is easy to write a virus that copies itself onto a medium in order to infect other computers from there. The clear disadvantage of this method is, again, that it needs a starting point, for example a user who deliberately burns a CD with the virus and passes it on. More recently, USB sticks are being used more and more, and the first USB sticks include an autostart function for programs. This option, too, is well suited to spreading one's own virus quickly to others and triggering automatic execution there. But this also requires that a USB stick is plugged in and passed on.

What other options are there, and above all, which are more helpful for spreading a worm quickly? This question cannot be answered definitively, but some trends are emerging that can be named here. In recent years, so-called peer-to-peer, or P2P, networks have become ever more popular. They allow users to make arbitrary files and folders on their own computer available to others. Many such file-sharing programs have default folders in which these shares are stored, and so it is very popular among VXers to give their virus an interesting file name, such as a crack for Windows, and to save it in this folder. The method works similarly with file-hosting sites, so-called sharehosters, such as Rapidshare. You put your virus online as anonymously as possible, give it interesting-looking content, and wait for it to spread. One could further imagine automatically advertising this file in unprotected forums or blogs, hoping that more people will download the program […]

Figure: The logo of the Iranian Purgatory VX team
Communication among VXers

By now you should have gained a good insight into the world of virus programming, but a few aspects are still missing. We will turn to them in conclusion. First there is the question of how VXers communicate with one another and exchange information. As in most scenes on the Internet, almost all virus writers run their own website, on which they exhibit their art, their viruses, and give others insight into the source code. Only in the rarest cases are compiled versions supplied with the source files. This is due to the ideology you have already come to know: the point is to spread knowledge, not to cause damage. A first port of call for VXers is the site vx.netlux.org (which by now links to vx.org.ua), which holds a huge archive of texts and sources from which one can acquire basic to very specific knowledge. Many VXer sites are also linked directly from there. Furthermore, most sites are written in English. As in professional life, English is the most important language for exchanging information, even though some groups are organised by country.

The next means of communication is email, used mainly when information is to be exchanged in a targeted way, for example about a new project or a technology that is to be kept secret for the time being. The same goes for the various instant messengers such as ICQ, MSN and Yahoo. One of the most important communication channels for VXers is Undernet, an IRC-based chat network in which the most important VX channels are found and through which most people enter the scene and make contacts. IRC is THE chat medium among VXers; although it is not among the most secure, it is the most popular. There is hardly any hierarchy: every stranger is usually welcomed in a friendly manner and asked about his interests, and help is usually offered quickly and concretely. Undernet is a perfect starting point for anyone interested. VXers have so far stayed away from other chat technologies such as SILC, even though these might seem more sensible from a security point of view.

All these means seem trivial at first, but one last and probably most important form of communication remains: the e-zines. The term means nothing other than electronic magazine. Normally it refers to an HTML-based collection of files that contains source code and texts and reports on the current state of a group or of the scene in general. These e-zines are published by various groups at regular intervals and are always eagerly awaited by the whole scene, since it is usually not known in advance which new viruses and techniques will be published. Increasingly, forums are also used in place of e-zines, but this approach has not yet proven itself sufficiently and is used rather little. As you can see, VXers use quite ordinary communication channels like everyone else, but also fall back on customs typical of the scene.

Cooperation between VXers and anti-virus companies

At first glance this heading may seem very strange. What reason should either of the two parties have to cooperate with the other in any way, when they are complete opposites and are always fighting to gain the upper hand? But on closer inspection it becomes clear that there is a connection between the two. It is by no means comparable to the one among VXers themselves, but it exists. One peculiarity of VXers, for example, is to send their viruses in executable form, for instance as an .exe file, to anti-virus companies so that they can examine the viruses and draw up a diagnosis […]

About the author

Under his nickname SkyOut, Marcell Dietl has been working since […] on his project site www.smash-the-stack.net, where he publishes programs and texts on IT security (VX included). He also runs his blog there and can be reached at skyout@smash-the-stack.net. He is currently continuing his self-education in IT security and aims to start an apprenticeship or a degree in computer science next year.

On the Internet

http://www.29a.net - the site of the famous group 29A
http://www.rrlf.de.vu - the site of the German Ready Rangers Liberation Front
http://www.doomriderz.co.nr - the site of the American DoomRiderz team
http://www.freewebs.com/purgatory-vx - the VX site of the Iranian Purgatory team
http://www.eof-project.net - the EOF Project
http://vx.eof-project.net - the VX forum of the EOF Project
http://vx.netlux.org - central port of call for everything about viruses, with texts, source code and links
http://vxchaos.official.ws - data collection on VX and security
Society
lecture
2007-12-29 14:00
Saal 2
de
Markus Schneider
Wahlchaos
Paradoxien des deutschen Wahlsystems
Wahlchaos deals with voting procedures from a mathematical and a political point of view. The elections of
1998, 2002 and 2005 were examined and manipulated a posteriori, and the effects were discussed.
http://univis.uni-magdeburg.de/form?__s=2&dsc=anew/lecture_view&lvs=fgse/ipw/zentr/psy_0&anonymous=1&founds=fgse/ipw/zentr/psy_0,fma/iag/zentr/comput,/linear,/mab,/oberse&nosearch=1&ref=main&sem=2006s&__e=
Seminar page in the university information system
$$Q := \frac{\text{party's vote count}}{\text{total vote count}} \cdot \text{total seat count}$$
$Q$, $Q - \lfloor Q \rfloor$
$S_P$; $N_1, N_2, N_3, \ldots$; $N_i$
$N_i = i$, $N_i = 2i - 1$
$M$, $Q$: $|Q - M| \le 1$
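The formula fragments above appear to describe seat allocation by quota and by divisor sequences; the surrounding prose did not survive extraction. The following is a minimal Python sketch under stated assumptions: that Q is the quota defined above, that the remainder is Q − ⌊Q⌋ (the extracted text shows only "Q − Q", so the floor is an assumption), and that N_i = i and N_i = 2i − 1 are the divisor sequences commonly known as D'Hondt and Sainte-Laguë. All function names and the vote counts are made up for illustration and are not taken from the talk.

```python
from fractions import Fraction
from math import floor


def hare_quota(party_votes, total_votes, total_seats):
    """Q := party votes / total votes * total seats, kept as an exact fraction."""
    return Fraction(party_votes, total_votes) * total_seats


def largest_remainder(votes, seats):
    """Give each party floor(Q) seats, then the rest by largest remainder Q - floor(Q)."""
    total = sum(votes.values())
    quotas = {p: hare_quota(v, total, seats) for p, v in votes.items()}
    alloc = {p: floor(q) for p, q in quotas.items()}
    left = seats - sum(alloc.values())
    by_remainder = sorted(votes, key=lambda p: quotas[p] - alloc[p], reverse=True)
    for p in by_remainder[:left]:
        alloc[p] += 1
    return alloc


def divisor_method(votes, seats, divisor):
    """Highest-averages method: a party's i-th seat is ranked by votes / N_i."""
    alloc = {p: 0 for p in votes}
    for _ in range(seats):
        winner = max(votes, key=lambda p: Fraction(votes[p], divisor(alloc[p] + 1)))
        alloc[winner] += 1
    return alloc


def dhondt(i):
    return i            # N_i = i


def sainte_lague(i):
    return 2 * i - 1    # N_i = 2i - 1


if __name__ == "__main__":
    # Hypothetical vote counts, chosen only to make the methods differ.
    votes = {"A": 340_000, "B": 280_000, "C": 160_000, "D": 60_000}
    print("largest remainder:", largest_remainder(votes, 7))
    print("D'Hondt:          ", divisor_method(votes, 7, dhondt))
    print("Sainte-Lague:     ", divisor_method(votes, 7, sainte_lague))
```

With these made-up votes, the largest-remainder and Sainte-Laguë allocations coincide (3, 2, 1, 1 seats), while D'Hondt moves the last seat from the smallest party D to B (3, 3, 1, 0). Exact fractions are used so that comparisons between quotients are not affected by floating-point rounding.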
$\frac{4}{614} \cdot 47{,}194{,}062 \approx 310{,}000$
$70{,}500$
Volldampf voraus!
24. Chaos Communication Congress
Veranstaltungen
Tag 1 - Saal 1
Opening Event
Welcome Keynote
Welcome to the Congress!
SkyTee, Jens Ohlig, Ingo Schwitters, Sebastian Velke 2007-12-27 11:30 Saal 1 en lecture Making
Steam-Powered Telegraphy
Wherein a League of Telextraordinary Gentlemen present the marvel of Telex on the global Internet -- driven by a steam engine
We have built and modified a steam-powered Telex machine and connected it to the new-fangled invention for modern telegraphy known as "the
Internet". We will present this steampunkish invention in the form of a lecture, thus hoping to enlighten interested ladies and gentlemen on the principles of
steam engine physics, 5-bit Baudot encoding, and historic telegraphy in general.
Der Bundestrojaner
Die Wahrheit haben wir auch nicht, aber gute Mythen
The Bundestrojaner is examined from the political, legal and technical side.
TOR
NEDAP-Wahlcomputer in Deutschland
We bring you up to date on the use of NEDAP voting computers in Germany.
Design Noir
The seedy underbelly of electronic engineering
In contemporary Western society, electronic devices are becoming so prevalent that many people find themselves surrounded by technologies they find
frustrating or annoying. The electronics industry has little incentive to address this complaint; I designed two counter-technologies to help people defend
their personal space from unwanted electronic intrusion. Both devices were designed and prototyped with reference to the culture-jamming "Design Noir"
philosophy. The first is a pair of glasses that darken whenever a television is in view. The second is a low-power RF jammer capable of preventing cell phones
or similarly intrusive wireless devices from operating within a user's personal space. By building functional prototypes that reflect equal consideration of
technical and social issues, I identify three attributes of Noir products: Personal empowerment, participation in a critical discourse, and subversion.
http://www.ladyada.net/make/wavebubble/
http://www.ladyada.net/make/tvbgone/
http://www.ladyada.net/pub/research.html
Tag 1 - Saal 2
Verteilte Sicherheit
Zur Ordnung der Überwachung
The integration of visual surveillance systems and the linking of military and non-military uses of these technologies is proceeding
gradually but steadily.
Cristian Yxen, Erdgeist, Denis Ahrens 2007-12-27 18:30 Saal 2 de lecture Hacking
Trecker fahrn
Vom Gefühl, einen offenen Bittorrent Tracker zu fahren
BitTorrent from the perspective of those who run the infrastructure and, of course, also use it themselves.
http://opentracker.blogs.h3q.com/ The opentracker blog
http://erdgeist.org/arts/software/opentracker Opentracker project page
AnonAccess
Ein anonymes Zugangskontrollsystem
AnonAccess is an electronic system that enables anonymous access, and not only to hackerspaces.
http://www.das-labor.org/wiki/AnonAccess AnonAccess in the Labor wiki
Tag 1 - Saal 3
Freifunkerei
And a Do-It-Yourself society against the state.
The term Freifunk Firmware has found a place on the shelves in the lives of numerous people. It has become an immense knot of activities, not just sitting
silently like a dusty heirloom. "Freifunkerei" has become an example of how DIY cultures can act and re-create alternatives in a world which seems both
confronted and abandoned by the state.
Desperate House-Hackers
How to Hack the Pfandsystem
How do these reverse vending machines for deposit bottles actually work? We find out.
Cybercrime 2.0
Storm Worm
Not only the Web has reached level 2.0, also attacks against computer systems have advanced in the last few months: Storm Worm, a peer-to-peer based
botnet, is presumably one of the best examples of this progress. Instead of a central command & control infrastructure, Storm uses a distributed
communication channel based on Kademlia / Overnet. Furthermore, the botherders use fast-flux service networks (FFSNs) to host some of the content.
FFSNs use fast-changing DNS entries to build a reliable hosting infrastructure on top of compromised machines. Besides using the botnet for DDoS attacks,
the attackers also send lots of spam - most often stock spam, i.e., spam messages that advertize stocks. This talk presents more information about Storm
Worm and the other aspects of modern cybercrime.
http://honeynet.org/papers/ff/ Fast-Flux Service Networks
http://honeyblog.org my blog
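Kademlia, which the abstract names as the basis of Storm's communication channel, locates peers and keys by the XOR of their identifiers. The following is a minimal Python sketch of that distance metric only, assuming small integer IDs for readability (real deployments use much longer bit strings). Overnet and Storm specifics are not modelled, and the function names are hypothetical.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: the bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_nodes(target: int, node_ids: list[int], k: int = 3) -> list[int]:
    """Return the k known node IDs with the smallest XOR distance to target."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]


if __name__ == "__main__":
    # Illustrative 4-bit IDs; not taken from any real network.
    known = [0b0001, 0b1000, 0b1011, 0b1110, 0b0110]
    print(closest_nodes(0b1010, known, k=2))
```

The metric is symmetric and is zero only for identical IDs, which is what lets every peer agree on which nodes are "closest" to a key without a central directory.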
VX
The Virus Underground
The listeners will be introduced to the world of virus coding. They will understand how it can be seen as a way of expressing yourself and why it is a form
of hacking. Furthermore, they will get to know which important groups, authors and viruses have existed in recent years and which are still active
today. Important technical terms will be explained, as well as trends of recent years and of the future. And more.
http://vx.netlux.org/ Virus database
http://vxchaos.official.ws/ VX CHAOS File Server
http://www.smash-the-stack.net Smash-The-Stack
http://www.freewebs.com/purgatory-vx/ Purgatory Virus Team
http://www.eof-project.net/ EOF-Project
http://vx.eof-project.net/
http://www.29a.net/ 29A Labs
http://www.rrlf.de.vu/ Ready Rangers Liberation Front
http://www.doomriderz.co.nr/ Doomriderz VX Team
Tag 2 - Saal 1
Christian Kurtsiefer, Ilja Gerhardt, Antia Lamas 2007-12-28 14:00 Saal 1 en lecture Science
Why Silicon-Based Security is still that hard: Deconstructing Xbox 360 Security
Console Hacking 2007
The Xbox 360 is probably the video game console with the most sophisticated security system to date. Nevertheless, it has been hacked, and now Linux can
be run on it. This presentation consists of two parts.
http://www.free60.org/ Free60 Project
Constanze Kurz, Frank Rosengart, Andreas Lehner 2007-12-28 17:15 Saal 1 de lecture Community
Chaos Jahresrückblick
Ein Überblick über die Aktivitäten des Clubs 2007
We present the activities of and events in the Chaos Computer Club over the past year. This includes the CCC's campaigns, its
lobbying work, reports and anecdotes from events within the CCC, as well as talks and conferences in which CCC representatives
took part.
DIY Survival
How to survive the apocalypse or a robot uprising
The apocalypse could happen any day. You're going to need things to survive and you're going to have to make them yourself.
Andreas Bogk, tina, Erdgeist, nibbler 2007-12-28 00:00 Saal 1 en contest Culture
Rule 34 Contest
There is porn of it.
Rule 34 says: there is porn of it. This contest will challenge the best and brightest to prove the rule under adverse circumstances in a race against the
clock.
Tag 2 - Saal 2
Absurde Mathematik
Paradoxa wider die mathematische Intuition
A short excursion into the abysses of mathematics. Humans are actually equipped with a fairly well-functioning intuition. Nevertheless,
there are paradoxes that are mathematically entirely correct and provable, yet contradict our intuition. The talk offers a
tour through some of these paradoxes, which are explained briefly and vividly.
Linguistic Hacking
How to know what a text in an unknown language is about?
It is sometimes necessary to know what a text is about, even if it is written in a language you don't know. This can be quite problematic if you do not even
know what language it is written in. This talk will show how it is possible to identify the language of a written text and get at least some information
about the contents, in order to decide whether a specialist, and which specialist, is needed to find out more.
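The abstract does not say which technique the talk uses. One common approach to identifying the language of a text is to count very frequent function words. The sketch below assumes three tiny hand-picked word lists, which are illustrative only and not from the talk; real systems use far larger word lists or character n-gram profiles. The function name is hypothetical.

```python
import re

# Tiny hand-picked function-word lists (an assumption for illustration only).
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "it", "that"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "zu"},
    "fr": {"le", "la", "les", "et", "est", "un", "une", "des"},
}


def guess_language(text):
    """Return the language code whose function words occur most often, or None."""
    # Split into runs of letters (Unicode-aware), ignoring digits and punctuation.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(guess_language("Der Vortrag ist nicht zu lang und das ist gut"))
    print(guess_language("The talk is about the language of a text"))
```

A guess like this is only good enough to decide which specialist or dictionary to reach for; short texts and closely related languages will be misclassified by such small lists.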
To be or I2P
An introduction into anonymous communication with I2P
I2P is a message-based anonymizing network. It builds a virtual network between the communication endpoints. This talk will introduce the technical
details of I2P and show some exemplary applications.
http://www.i2p.net/ I2P website
Ralph Kusserow, Christine Ketzer, Yvette Krause 2007-12-28 23:00 Saal 2 de movie Society
Das Panoptische Prinzip - Filme über die Zeit nach der Privatsphäre
Ergebnisse des Minutenfilmwettbewerbs des C4 und des Kölner Filmhauses
In recent years, not least since September 11, civil rights have been eroded and surveillance by the state, but also by business, has become
ever more comprehensive. Identification procedures such as fingerprinting or other
biometric methods increasingly affect ordinary citizens as well. The presumption of innocence, a paradigm guaranteed by the rule of law, is being dismantled:
everyone is a potential suspect.
http://www.panoptisches-prinzip.de/ Das panoptische Prinzip
Tag 2 - Saal 3
Hacking SCADA
how to own critical infrastructures
The acronym SCADA stands for "Supervisory Control And Data Acquisition", and it relates to industrial automation inside critical infrastructures. This talk will
introduce the audience to SCADA environments and their totally different security approaches, outlining the key differences from typical IT security
best practices. We will analyze a real-world case study from industry. We will describe the most common security mistakes and some of the direct
consequences of such mistakes for a production environment. In addition, attendees will be shown a video of real SCADA machines reacting to these attacks
in the most "interesting" of ways! :)
http://conference.hitb.org/hitbsecconf2007kl/materials/D1T2%20-%20Raoul%20Chiesa%20and%20Mayhem%20-%20Hacking%20SCADA%20-%20How%20to%200wn%20Critical%20National%20Infrastructure.pdf Our slides @hitb07
C64-DTV Hacking
Revisiting the legendary computer in a joystick
The C64-DTV is a remake of the classic home computer, sold as a video game built into a joystick. The talk gives an overview of the structure of the DTV
and shows different hardware and software modifications that can be made.
Tag 3 - Saal 1
Tomislav Medak, Toni Prug, Marcell Mars 2007-12-29 12:45 Saal 1 en lecture Society
Sex 2.0
Hacking Heteronormativity
The long tail of dating communities and the deconstruction and reconstruction of gender and sexual orientation have unforeseen effects
on our sex lives. An overview of what sex is, how dating communities work, and how to arrive at a fulfilling
sex life.
http://www2.gender.hu-berlin.de/gendermediawiki/index.php/Hauptseite Gender@Wiki
Hacker Jeopardy
Die ultimative Hacker-Quizshow
The well-known quiz format, but of course with topics you would never get to see on television.
Tag 3 - Saal 2
Hamburger Wahlstift
On 24 February, Hamburg wanted to vote with the digital voting pen (Digitaler Wahlstift) as a pilot project.
http://www.24-februar.de/ Promotional site for the election
Wahlchaos
Paradoxien des deutschen Wahlsystems
Wahlchaos deals with voting procedures from a mathematical and a political point of view. The elections of 1998, 2002 and 2005 were examined and
manipulated a posteriori, and the effects were discussed.
http://univis.uni-magdeburg.de/form?__s=2&dsc=anew/lecture_view&lvs=fgse/ipw/zentr/psy_0&anonymous=1&founds=fgse/ipw/zentr/psy_0,fma/iag/zentr/comput,/linear,/mab,/oberse&nosearch=1&ref=main&sem=2006s&__e=
Seminar page in the university information system
Space Communism
Communism or Space first?
Following "Chaos und Kritische Theorie" from 23C3, another verbal battle: Oona Leganovic (aka Ijon Tichy) will promote the idea of sublating the capital
relation and bringing about communism first and only then going to space, because otherwise earthly problems will be spread everywhere. Daniel Kulla
(impersonating Captain Kathryn Janeway) will, on the other hand, defend the exploration humanism that once already ended the Middle Ages and
that can be expected to do the same to the encrusted planetary commodity circus.
http://events.ccc.de/camp/2007/Fahrplan/events/1856.en.html "Weltraumkommunismus" at the Camp '07
http://dewy.fem.tu-ilmenau.de/CCC/CCCamp07/video/m4v/cccamp07-de-1856-Weltraumkommunismus.m4v
Video recording from the Camp (m4v, 144 MB)
Tag 3 - Saal 3
Introduction in MEMS
Skills for very small ninjas
MicroElectroMechanical Systems, or MEMS, are part of microsystems technology: systems with electrical and mechanical subsystems at the micro scale. This talk
is basically an introduction to the technology and to its potential for hardware hacks and homebrew devices.
haXe
hacking a programming language
haXe is a programming language for developing both server AND client side of a website. haXe can do Javascript/AJAX, Database access and even Flash and
video streaming. All with one single programming language.
http://haxe.org haXe website
http://nekovm.org neko website
http://haxe.org/hxasm hxASM website
http://haxevideo.org haxeVideo website
Tag 4 - Saal 1
Wearables of the electronic and digital ages and the female cyborg
Historians of technology usually argue that in the mediation of technology, female icons served two purposes: firstly, attracting the male buyer as erotic
signals; secondly, representing the simplicity of a technology's handling. This scheme is obviously too simple and in itself stereotyped. It neglects the
nuances of how women are envisioned in relation to what technologies and what this means for both the semiotics of a technology and the identities of
women. For the case of portable electronics, I will demonstrate such nuances. For example, the radio was associated with female users as long as it served
as leisure entertainment in public spaces.
However, when marketed as an information tool back home or on business tours, it was put in male hands. Furthermore, the popular ascriptions which
condensed in the visions of media, advertising and manuals, also materialized in the artifacts themselves. Thus, radios or cell phones which were targeted
explicitly at women had feminized designs, colours and features which should relate to their life experiences. In my talk, I will also include this dimension
of the artifacts, analyzing them as frozen visions of social and cultural values.
Closing Event
Tag 4 - Saal 2
Tag 4 - Saal 3
OOXML
A twelve euros campaign against Microsoft's Office broken standard
Microsoft is currently trying to buy an ISO stamp for their flawed Office OpenXML (OOXML) specification.
http://www.noooxml.org/ Say NO to Microsoft Office broken standard
Tagungsband
ISBN 978-3-934-63606-4
books-on-demand.de