What is primitive data structure

Organizational matters. Algorithms and Data Structures (for ET / IT) program today. What are primitive data types? Primitive data types

Transcript

1 Organizational Algorithms and Data Structures (for ET / IT) Summer semester 2017 Dr. Stefanie Demirci Computer Aided Medical Procedures Technical University of Munich Tutor question hours (Start: Today) Tuesday, 3: 00-5: 00 pm, room N1039 Thursday, 8: 00-9: 30, room 0999 New discussion forum Piazza polls possible! 2 program today What are primitive data types? 1 Introduction 2 Basics of algorithms 3 Basics of data structures Primitive data types and number representation Fields as a sequential list Characters and strings Primitive data types We refer to basic data types built into programming languages ​​as primitive data types. Compound data types can be formed by combining primitive data types. Examples of primitive data types in C: int for whole numbers float for floating point numbers bool for logical values ​​3 4

2 bits and bytes Bits and bytes Bit 7 Bit 0 Bit 7 Bit 0 1 byte = 8 bit bytes as a unit of measurement for memory sizes (according to IEC, traditional): 2 10 bytes = 1024 bytes = 1 KiB, one kilo byte (Kibi byte) 2 20 bytes = 1 MiB, one mega byte (or MebiByte) 2 30 bytes = 1 GiB, one giga byte (or GibiByte) 2 40 bytes = 1 TiB, one tera byte (or TebiByte) 2 50 bytes = 1 PiB , one Peta byte (or PebiByte) 2 60 bytes = 1 EiB, one Exa byte (or ExbiByte) 1 byte = 8 bit bytes as a unit of measurement for memory sizes (according to IEC, metric): 10 3 bytes = 1000 bytes = 1 kb , one kilo byte (large B) 10 6 bytes = 1 MB, one mega byte 10 9 bytes = 1 GB, one giga byte bytes = 1 TB, one tera byte bytes = 1 PB, one peta byte bytes = 1 EB, a Exa byte Note: Bits are also used as dimensions, e.g. 16 Mbit or 16 Mb (small b) Primitive data types in C-like languages ​​We consider in detail primitive data types for: 1 natural numbers (unsigned integers) 2 whole numbers (signed integers) 3 floating point numbers (floats) 7 8

3 Number representation Number representation Decimal system: Base x = 10 coefficients cn {0,1,2,3,4,5,6,7,8,9} Example: = Binary system: Base x = 2 coefficients cn {0,1} Example: = = Octal system: base x = 8 (= 2 3) coefficients cn {0,1,2,3,4,5,6,7} Example: = = hexadecimal system: base x = 16 (= 2 4) coefficients cn { 0,1,2,3,4,5,6,7,8,9, A, B, C, D, E, F} Example: 7B 16 = B 16 0 = How many digits per number? Problem Given number z N, how many digits m are required with respect to base x? Solution Explanation: (a R) m = log x (z) +1 a = floor (a) = largest integer less than or equal to aa = ceil (a) = smallest integer greater than or equal to aa 1

4 Largest number per number of digits? Problem Given base x and m digits, what is the largest representable number? Solution examples: x = 2, m = 4: x = 2, m = 8: x = 16, m = 2: z max = xm 1 z max = = 15 = z max = = 255 = natural numbers in C-like Languages ​​Natural Numbers In computers, binary representation is used with a fixed number of digits (called bits). The primitive data types for natural numbers are: 8 bits (one byte), numbers that can be represented: {0, ..., 255} in C: unsigned char 16 bits, numbers that can be represented: {0, ..., 65535} in C: unsigned short 32 bits, representable numbers: {0, ...,} in C: unsigned long 64 bits, representable numbers: {0, ..., 2 64 1} in C: unsigned long long z max = = 255 = FF Negative numbers represented by 2-complement Example for 4 bits (representable numbers: 2 4 = 16): 2-complement representation I 2-complement representation Let x N, x> 0. The 2-complement representation xz of x using n bits is given by xz = 2 n x. This gives: 0000 = = = = = = = = = = = = = = = = -1 The previous example was: 5 = 1011, i.e. x = 5 and n = 4. Now: 5 z = = 16 5 = 11 = So the first bit is the sign! 15 16

5 2-complement representation II Let b n b n 1 ... b 1 be a bit sequence. (bnbn 1 ... b 1) z is the numerical value in 2-complement representation for positive numbers from 0 to 2 n 1 1 corresponds to (bnbn 1 ... b 1) z of the binary representation: (0b n 1 ... b 1) z = (0b n 1 ... b 1) 2 for negative numbers from 2 n 1 to 1 the following generally applies: (1b n 1 ... b 1) z = 2 n 1 + (0b n 1 ... b 1) 2 (bnbn 1 ... b 1) z = bn (2 n 1) + (bn 1 ... b 1) 2 properties of 2-complement For n N, we have () z = (2 n 1) + 2 n = 2 n 1 + (2 n 1 1) = 1 To get x from x in 2-complement representation: Form bit-wise complement and add 1st example: Negative of 6 = (0110) 2 with n = 4 6 = () z +1 = (1001) z +1 = (1010) z and back: 6 = () z +1 = (0101) z +1 = (0110) z Whole numbers in C-like languages ​​Whole numbers The primitive ones Data types for whole numbers are: 8 bits: unsigned char {0, ..., 255} signed char {128, ..., 127} 16 bits: unsigned short {0, ..., 65535} signed short {32768, ..., 32767} 32 bits: unsigned long {0, ..., 2 32 1} signed long {2 31, ..., 2 31 1} 64 bits: unsigned long long {0, ..., 2 64 1} signed long long {2 63, ..., 2 63 1} signed can be omitted (except for char!) Unsigned int and signed int are 16, 32 or 64 bit, depending on the system. Rational numbers I Fixed point representation: Comma on Fixed digit in number Example with n = 32: 32 1 integer part comma Disadvantages: fewer large numbers can be displayed fixed accuracy of the decimal places fractional part 19 20

6 rational numbers II 32 1 whole number fraction fractional part comma Interpretation for r Q: r = cn 2 n + ... + cccm 2 m with n digits before and m after the decimal point Example: = = = floating point numbers I Scientific notation: x = a 10 b for x R, where: a R with 1 a <10 b Z Examples: C Hz absolute zero clock frequency A8X processor Three components: sign mantissa a (determines the accuracy) exponent b (determines the size of the value range) Problem: with a fixed length of the mantissa (e.g. 3 digits) between = and = no number can be displayed! Floating point numbers II Floating point numbers III 1 bit 1 bit 11 bit 52 bit 8 bit 23 bit 32 bit float V exponent E mantissa M scientific representation with base 2 sign bit V f = (1) V (1 + M) 2 E bias 64 bit double mantissa M always has the form 1.abc, so the first digit is omitted (hidden bit) Exponent E is saved unsigned, shifted by bias with 32 bit: bias = 127, with 64 bit: bias = 1023 Usual floating point formats : Bit sign. Exponent mantissa valid decimal point. Representable range 32 1 bit 8 bit 23 bit 7 ± to ± bit 11 bit 52 bit 15 ± to ± bit 15 bit 64 bit 19 ± to ± In C: float (32 bit), double (64 bit), long double (80 Bit) 23 24

7 Be careful with the floating point! Definition of data structure Floating point numbers are convenient, but be careful! Many decimal numbers do not have a floating point representation Example: = (periodic) Due to the fixed length of the mantissa, many numbers cannot be represented either. Example: With 3 digits mantissa between = and = no number can be represented! Comparisons of floating point numbers are critical. Example: (== 0.3) is mostly FALSE! Interest calculations and the like NEVER use floating point numbers! Instead: special libraries such as GMP definition of data structure (according to Prof. Eckert) A data structure is a logical arrangement of data objects that represent information, enable access to the represented information via operations on data and manage the information. Two main components: data objects e.g. defines operations on the objects using primitive data types, e.g. defined as functions primitive data types in C program today natural numbers, e.g. unsigned short, unsigned long Value range: with n bits from 0 to 2 n 1 operations: +, -, *, /,%, <, ==,! =,> whole numbers, e.g. int, long Value range: with n bits from 2 n 1 to 2 n 1 1 Operations: +, -, *, /,%, <, ==,! =,> Floating point numbers, e.g. double, float Value range: depending on size Operations: +, -, *, /, <, ==,! =,> Logical values, bool Value range: true, false Operations: && ,,!, ==,! = 1 Introduction 2 Fundamentals of algorithms 3 Fundamentals of data structures Primitive data types and number representation Fields as a sequential list Characters and strings 27 28

8 Field definition Field as a sequential list Field definition A field A is a sequence of n data elements (di) i = 1, ..., n, with n N 0. A = d 1, d 2, ..., dn Die Data elements di are any data type (e.g. primitive). Examples: A are the natural numbers from 1 to 10, in ascending order: A = 1,2,3,4,5,6,7,8,9,10 If n = 0, the field is empty. Representation of field A as a sequential list (or array) fixed number n of data elements stored contiguously in linear order with index Access to i-th element via index i: A [i] ... Field A: A [n-1] A [n-2] A [2] A [1] A [0] Attention: Indexing usually starts at 0! Example sequential list Properties sequential list A [2] A [1] A [0] Field A: Field declaration in C (optional example): int A [3]; Access to elements: A [0] = 0; A [1] = 8; A [2] = A [1] + 7; // now: A [2] == 15 field A with length n as a sequential list (array) Advantages: direct access to elements in constant time using A [i] sequential iteration very simple Disadvantages: waste of memory if the list is not fully occupied Extending the sequential list is time-consuming Adding and deleting elements is time-consuming 31 32

9 Extending the sequential list Deleting an element from the list Given: Field A, length n + 1, as a sequential list Desired: Field A extended to length n + 2 Reserve new memory of size n + 2 Copy old list to new memory Given: Field A, length n, as a sequential list Desired: Delete element i from field A Remove element i Copy list elements to i ... Field A: A [n] A [n-1] A [2] A [1] A [0 ] New field A: A [n + 1] A [n] A [n-1] ... A [2] A [1] A [0] Insertion of element in list Outlook: Use of sequential lists Given: Field A, length n, as a sequential list Desired: Insert new element in field A at position i Copy list elements to i Insert element i in 2D and 3D images! 35 36

10 Program Today Bytes and ASCII 1 Introduction Interpretation of a byte as a character (instead of numbers) e.g. ASCII code 7 bit ASCII code: 2 Basics of algorithms 3 Basics of data structures Primitive data types and number representation Fields as sequential list Characters and character strings Code A..B..C..D..E..F 0 .. nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si 1 .. dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us 2 .. sp! # $% & () * +, -. /:; <=>? A B C D E F G H I J K L M N O 5 .. P Q R S T U V W X Y Z [\] ˆ 6 .. a b c d e f g h i j k l m n o 7 .. p q r s t u v w x y z {} del ASCII extensions, Unicode ASCII uses only 7 bits of a byte contains e.g. no umlauts (ä, ö, ü) or accents (é, ç) there are various extensions from ASCII to 8 bit in Europe ISO Latin-1 is common (ISO norm) occupies the codes of (or 80-FF in hex) Unicode was introduced as a 16-bit coding. The first 128 characters match ASCII, the next 128 characters match ISO Latin-1 Cyrillic, Arabic, Japanese characters UTF-8 is a multi-byte encoding of Unicode (1-6 bytes) Code length is encoded by the first bits Characters and strings Representation of an ASCII character in C: char Character literals in single quotation marks Examples: A, u, D char characters = A; Be careful with non-ascii characters! Representation of a string? (English: String) String literals in double quotes Example: AuD stored in C as a field (sequential list) of characters: '\ 0' 'D' 'u' 'A' Index 39 40

11 Summary 1 Introduction 2 Fundamentals of algorithms 3 Fundamentals of data structures Primitive data types and number representation Fields as a sequential list of characters and strings 41