<--Last Chapter | Table of Contents | Next Chapter--> |
Good programmers write good programs. Great programmers write good programs and good data structures. Organizing your data is as important as the program that crunches the data and produces a result.
Unfortunately, my experiences in the corporate world have taught me that that the only data structure used is the single dimensional array. When results are the only goal and more processing power is the cure for bad software design, arrays are easy to implement (they are built into Ada). Even the worst programmer knows how to use an array. And arrays are easy to understand. Try to use a linked list, and a programmer can get into trouble with his boss for using risky, "advanced" technology.
Alternatively, programmers will sometimes rely on the complexity and overhead of databases when a simplier solution using the correct data structure would be faster and easier to implement.
If you are lucky enough to work for a company that uses more than arrays, this chapter will discuss how to use other kinds of data structures in Ada.
Perhaps because of an oversight, Ada 95 with all its annexes has no equivalent to the C++ Standard Template Library. (Ada 2005 has a data strucutre library.) There are no standard packages providing common data structures. The Gnat compiler fills part of this void with packages for creating simple tables and hash tables.
The Booch components are a set of C++ objects created by Grady Booch. These were later ported to Ada 95. The components contain sets of general purpose data structures. The Booch components are available from AdaPower.Net or in RPM format from the Ada Linux Team. This is one popular choice for Ada's unofficial "Standard Template Library".
The components are organized into three main categories: tools, support and structs. The tools cover many items already implemented in the standard Ada or Gnat packages, such as searching, sorting and pattern recognition. Support refers to components that implement the tools and structs.
The structs (data structures) are the primary interest of Ada programmers. These are further subcategorized by the user's requirements: bounded (where the size is known at compile-time or there's no heap allocation), unbounded (using dynamic allocation and item caching), or the dynamic (a compromize between bounded and unbounded). The default if no others are available is unbounded.
Dynamic and unbounded types can specify a storage manager to use. The storage manager is a program that allocates memory. Use Global_Heap package if you have no preference.
Unbounded structures allocate memory whenever a new item is added to the structure.
Dynamic structures allocate memory in fixed-sized chunks. Each chunk is large enough for several items. The chunk size is set when the dynamic data structure is first created, but it can be altered at any time. When a chunk is full, the structure is grows by the size of another chunk. This reduces the number of memory allocations to improve performance.
Each dynamic structure includes these subprograms:
The Booch components are organzied in a hierarchy of packages. The BC package is the top-most package. BC defines the basic execptions that can be raised by the various components:
Container_Error : exception; Duplicate : exception; Illegal_Pattern : exception; Is_Null : exception; Lexical_Error : exception; Math_Error : exception; Not_Found : exception; Not_Null : exception; Not_Root : exception; Overflow : exception; Range_Error : exception; Storage_Error : exception; Synchronization_Error : exception; Underflow : exception; Should_Have_Been_Overridden : exception; Not_Yet_Implemented : exception;
The data structure components are:
Data Structure | Booch Packages | Description |
Bags | bc-containers-bags-bounded
bc-containers-bags-dynamic bc-containers-bags-unbounded |
Unordered collection of items. Duplicates are counted but not actually stored. |
Collections | bc-containers-collections-bounded bc-containers-collections-dynamic bc-containers-collections-unbounded |
Ordered collection of items. Duplicates are allowed and stored. |
Deques | bc-containers-deques-bounded bc-containers-deques-dynamic bc-containers-deques-unbounded |
Double-ended queues |
Single linked Lists | bc-containers-lists-single | A sequence of 0 or more items with a head and a pointer to each successive item. |
Double linked Lists | bc-containers-lists-double | A sequence of 0 or more items with a head and a pointer to both successive and previous items. |
Maps | bc-containers-maps-bounded bc-containers-maps-dynamic bc-containers-maps-unbounded |
A set with relationships between pairs of items. |
Queues | bc-containers-queues-bounded bc-containers-queues-dynamic bc-containers-queues-unbounded |
First in, first out list. |
Ordered (Priority) Queues | bc-containers-queues-ordered-bounded bc-containers-queues-ordered-dynamic bc-containers-queues-ordered-unbounded |
A sorted list, items removed from the front. |
Rings | bc-containers-rings-bounded bc-containers-rings-dynamic bc-containers-rings-unbounded |
A deque with only one endpoint. |
Sets | bc-containers-sets-bounded bc-containers-sets-dynamic bc-containers-sets-unbounded |
Unordered collection of items. Duplicates are not allowed. |
Stacks | bc-containers-stacks-bounded bc-containers-stacks-dynamic bc-containers-stacks-unbounded |
Last in, first out list. |
AVL Trees | bc-containers-trees-avl | Balanced binary trees |
Binary Trees | bc-containers-trees-binary-in_order bc-containers-trees-binary-post_order bc-containers-trees-binary-pre_order |
A list with two successors per item. |
Multiway Trees | bc-containers-trees-multiway-post_order bc-containers-trees-multiway-pre_order |
Tree with an arbitrary number of children. |
Directed Graphs | bc-graphs-directed | Groups of items with one-way relationships |
Undirected Graphs | bc-graphs-undirected | Groups of items with two-way relationships |
Smart Pointers | bc-smart | Access types that automatically deallocate themselves |
A definition of common data structures can be found at the National Institute of Standards and Technology.
The components are generic packages and must be instantiated for a particular type. They are arranged in hierarchies of generic packages. Each parent package must be instantiated before its child. For example, to use single linked lists (bc.containers.lists.single), bc.containers, bc.containers.lists, and bc.containers.lists.single must all be be created for the item type.
As with many component libraries, the Booch components represent all structures in memory, not in long-term storage. They cannot be used to create disk files, although the data could be saved to disk and reloaded later.
Containers are a controlled tagged record that encloses an item. The Booch components are composed of items stored in containers that are arranged in different ways.
To use any of the Booch components, a container must be instantiated to hold your item. For example, to create a new package to manage character in containers, use
package charContainers is new BC.Containers (Item => Character);
Iterators are created by New_Iterator in a data structure's package, but the subprograms that work with the iterator are defined in the Container package.
The Is_Done function indicates when all items have been traversed. When Is_Done is true, Current_Item is undefined. In other words, the program must loop through all items in the list, plus 1, before Is_Done is true.
Because an Iterator is a class-wide type, it must be assigned a new value when it is declared to avoid a compiler error.
i : charContainers.Iterator'class := charList.New_Iterator( customers );
First, a container must be defined to hold the item you want to store in your linked list.
package Containers is new BC.Containers (Item => Character);
Second, basic operations on lists must be instantiated.
package Lists is new Containers.Lists;
Finally, the single linked list package must be instantiated. For an unbounded package, you chose a storage pool to use. Single linked lists are always unbounded. Use Global_Heap if you have no preference.
package LS is new Lists.Single (Storage_Manager => Global_Heap.Pool, Storage => Global_Heap.Storage);
The single linked list package provides the following subprograms:
Notice that the term "Foot" refers to the last item in the list. The Ada string packages uses the term "Tail".
Here's an example:
with BC.Containers.Lists.Single; with Global_Heap; package customers is type aCustomer is record customerID : integer; accountBalance : float; end record; -- this is the item to put in the list package customerContainers is new BC.Containers (Item => aCustomer); -- create a new controlled tagged record container for customers package customerLists is new customerContainers.Lists; -- create a new list support package for our using container type package customerList is new customerLists.Single (Storage_Manager => Global_Heap.Pool, Storage => Global_Heap.Storage); -- create a single linked list package using the lists support -- customized for our container type end customers;
with ada.text_io, BC, customers; use ada.text_io, BC, customers; procedure list_demo is customers : customerList.Single_List; c : aCustomer; i : customerContainers.Iterator'class := customerList.New_Iterator( customers ); begin Put_Line( "This is a demo of the Booch components: single-linked lists" ); New_Line; -- The Newly Declared List Put_Line( "The list is newly declared." ); Put_Line( "The list is empty? " & customerList.Is_Null( customers )'img ); Put_Line( "The list is shared? " & customerList.Is_Shared( customers )'img ); Put_Line( "The list length is" & customerList.Length( customers )'img ); New_Line; -- Inserting a customer c.customerID := 7456; c.accountBalance := 56.74; customerList.Insert( customers, c ); Put_Line( "Added customer" & c.customerID'img ); Put_Line( "The list is empty? " & customerList.Is_Null( customers )'img ); Put_Line( "The list is shared? " & customerList.Is_Shared( customers )'img ); Put_Line( "The list length is" & customerList.Length( customers )'img ); c := customerList.Head( customers ); Put_Line( "The head item is customer id" & c.customerID'img ); c := customerList.Foot( customers ); Put_Line( "The foot item is customer id" & c.customerID'img ); New_Line; -- Apending a customer c.customerID := 9362; c.accountBalance := 88.92; customerList.Append( customers, c ); Put_Line( "Appended customer" & c.customerID'img ); Put_Line( "The list length is" & customerList.Length( customers )'img ); c := customerList.Head( customers ); Put_Line( "The head item is customer id" & c.customerID'img ); c := customerList.Foot( customers ); Put_Line( "The foot item is customer id" & c.customerID'img ); New_Line; -- Iterator example Put_Line( "Resetting the iterator.." ); customerContainers.Reset( i ); c := customerContainers.Current_item ( i ); Put_Line( "The current item is customer id" & c.customerID'img ); Put_Line( "Are we done? " & customerContainers.Is_Done( i )'img ); Put_Line( "Advancing to the next item..." ); customerContainers.Next( i ); c := customerContainers.Current_item ( i ); Put_Line( "The current item is customer id" & c.customerID'img ); Put_Line( "Are we done? " & customerContainers.Is_Done( i )'img ); Put_Line( "Advancing to the next item..." ); customerContainers.Next( i ); Put_Line( "Are we done? " & customerContainers.Is_Done( i )'img ); begin c := customerContainers.Current_item ( i ); exception when BC.NOT_FOUND => put_line( "BC.NOT_FOUND exception: no item at this position in the list" ); end; end list_demo;
This is a demo of the Booch components: single-linked lists The list is newly declared. The list is empty? TRUE The list is shared? FALSE The list length is 0 Added customer 7456 The list is empty? FALSE The list is shared? FALSE The list length is 1 The head item is customer id 7456 The foot item is customer id 7456 Appended customer 9362 The list length is 2 The head item is customer id 7456 The foot item is customer id 9362 Resetting the iterator.. The current item is customer id 7456 Are we done? FALSE Advancing to the next item... The current item is customer id 9362 Are we done? FALSE Advancing to the next item... Are we done? TRUE BC.NOT_FOUND exception: no item at this position in the list
Single linked lists should not be Guarded.
Double linked lists are useful for lists that must be browsed backwards and forwards continuously.
Double linked lists should not be Guarded.
The bags package provides the following subprograms:
Bags can be bounded, dynamic or unbounded.
Bags are implemented using a hash table. To declare a bag, a program must provide a hash function for storing items in the bag, and must indicate the size of the hash table.
Here's an example. Notice that some of the subprograms are in the Bags instantiation, and some in the Bags.Unbounded instantiation. Also notice the iterator moves over the items, but not the duplications:
with BC.Containers.Bags.Unbounded; with Global_Heap; package customers is type aCustomerID is new integer range 1_000..9_999; function IDHash( id : aCustomerID ) return Positive; -- our hash function package customerContainers is new BC.Containers (Item => aCustomerID); -- create a new controlled tagged record container for customers package customerBags is new customerContainers.Bags; -- create a new bag support for our using container type package customerBag is new customerBags.Unbounded( Hash => IDHash, Buckets => 99, Storage_Manager => Global_Heap.Pool, Storage => Global_Heap.Storage); -- create an unbounded bag package holding customer numbers end customers;
package body customers is function IDHash( id : aCustomerID ) return Positive is -- our hash function begin return Positive( id ); -- in this case, using the id is good enough end IDHash; end customers;
with ada.text_io, BC, customers; use ada.text_io, BC, customers; procedure bag_demo is customers : customerBag.Unbounded_Bag; c : aCustomerID; i : customerContainers.Iterator'class := customerBag.New_Iterator( customers ); begin Put_Line( "This is a demo of the Booch components: bags" ); New_Line; -- The Newly Declared Bag Put_Line( "The bag is newly declared." ); Put_Line( "The bag is empty? " & customerBag.Is_Empty( customers )'img ); Put_Line( "The bag extent is" & customerBag.Extent( customers )'img ); Put_Line( "The bag total size is" & customerBags.Total_Size( customers )'img ); New_Line; -- Inserting a customer c := 7456; customerBags.Add( customers, c ); Put_Line( "Added customer" & c'img ); Put_Line( "The bag is empty? " & customerBag.Is_Empty( customers )'img ); Put_Line( "The bag extent is" & customerBag.Extent( customers )'img ); New_Line; -- Inserting another customer c := 9362; customerBags.Add( customers, c ); Put_Line( "Added customer" & c'img ); Put_Line( "The bag is empty? " & customerBag.Is_Empty( customers )'img ); Put_Line( "The bag extent is" & customerBag.Extent( customers )'img ); Put_Line( "The bag total size is" & customerBags.Total_Size( customers )'img ); New_Line; -- Inserting duplicate customer c := 9362; customerBags.Add( customers, c ); Put_Line( "Added customer" & c'img ); Put_Line( "The bag is empty? " & customerBag.Is_Empty( customers )'img ); Put_Line( "The bag extent is" & customerBag.Extent( customers )'img ); Put_Line( "The bag total size is" & customerBags.Total_Size( customers )'img ); New_Line; -- Iterator example Put_Line( "Resetting the iterator.." ); customerContainers.Reset( i ); c := customerContainers.Current_item ( i ); Put_Line( "The current item is customer id" & c'img ); Put_Line( "Are we done? " & customerContainers.Is_Done( i )'img ); Put_Line( "Advancing to the next item..." ); customerContainers.Next( i ); c := customerContainers.Current_item ( i ); Put_Line( "The current item is customer id" & c'img ); Put_Line( "Are we done? " & customerContainers.Is_Done( i )'img ); Put_Line( "Advancing to the next item..." ); customerContainers.Next( i ); Put_Line( "Are we done? " & customerContainers.Is_Done( i )'img ); begin c := customerContainers.Current_item ( i ); exception when BC.NOT_FOUND => put_line( "BC.NOT_FOUND exception: no item at this position in the bag" ); end; end bag_demo;
This is a demo of the Booch components: bags The bag is newly declared. The bag is empty? TRUE The bag extent is 0 The bag total size is 0 Added customer 7456 The bag is empty? FALSE The bag extent is 1 Added customer 9362 The bag is empty? FALSE The bag extent is 2 The bag total size is 2 Added customer 9362 The bag is empty? FALSE The bag extent is 2 The bag total size is 3 Resetting the iterator.. The current item is customer id 7456 Are we done? FALSE Advancing to the next item... The current item is customer id 9362 Are we done? FALSE Advancing to the next item... Are we done? TRUE BC.NOT_FOUND exception: no item at this position in the bag
Bags are useful for counting the occurrences of an item in a large amount of data.
with BC.Containers.Sets.Bounded; with Global_Heap; package fruit_sets is -- my grandfather owned one of the largest fruit companies in the world type aFruit is ( Apples, Grapes, Peaches, Cherries, Pears, Plums, Other ); function FruitHash( f : aFruit ) return Positive; -- our hash function for the set package fruitContainers is new BC.Containers( item=> aFruit ); -- basic fruit container package fruitSets is new fruitContainers.Sets; -- basic set support package fruitBoundedSets is new fruitSets.Bounded( fruitHash, Buckets => 10, Size => 20 ); -- our actual set is an unbounded set end fruit_sets;
package body fruit_sets is function FruitHash( f : aFruit ) return Positive is begin return aFruit'pos( f )+1; -- good enough for this example end FruitHash; end fruit_sets;
with ada.text_io, kb_sets; use ada.text_io, kb_sets; procedure set_demo is use fruitSets; use fruitBoundedSets; s1 : Bounded_Set; s2 : Bounded_Set; s3 : Bounded_Set; begin Put_Line( "This is a demo of the Booch components: sets" ); New_Line; Add( s1, apples ); Add( s1, peaches ); Add( s2, apples ); Add( s2, peaches ); Add( s2, pears ); Put_Line( "Set 1 has apples and peaches." ); Put_Line( "Set 2 has apples, peaches and pears." ); New_Line; Put_Line( "Extent of set 1? " & Extent( s1 )'img ); Put_Line( "Extent of set 2? " & Extent( s2 )'img ); Put_Line( "Peaches in set 1? " & Is_Member( s1, peaches )'img ); Put_Line( "Pears in set 1? " & Is_Member( s1, pears )'img ); Put_Line( "Set 1 a subset of set 2? " & Is_Subset( s1, s2 )'img ); Put_Line( "Set 2 a subset of set 1? " & Is_Subset( s2, s1 )'img ); Put_Line( "Set 1 a subset of set 1? " & Is_Subset( s1, s1 )'img ); Put_Line( "Set 1 a proper subset of set 1? " & Is_Proper_Subset( s1, s1 )'img ); New_Line; s3 := s1; Union( s3, s2 ); Put_Line( "Set 3 is the union of set 1 and set 2" ); Put_Line( "Extent of set 3? " & Extent( s3 )'img ); end set_demo;
This is a demo of the Booch components: sets Set 1 has apples and peaches. Set 2 has apples, peaches and pears. Extent of set 1? 2 Extent of set 2? 3 Peaches in set 1? TRUE Pears in set 1? FALSE Set 1 a subset of set 2? TRUE Set 2 a subset of set 1? FALSE Set 1 a subset of set 1? TRUE Set 1 a proper subset of set 1? FALSE Set 3 is the union of set 1 and set 2 Extent of set 3? 3
The Collections package provides the following subprograms:
Collections are implemented as dynamically allocated arrays.
with BC.Containers.Collections.Dynamic; with Global_Heap; package products is type aProduct is record id : integer; weight : float; end record; package productContainers is new BC.Containers (Item => aProduct); -- this is the basic container package productCollections is new productContainers.Collections; -- create a new collection support for our using container type package productCollection is new productCollections.dynamic( Storage_Manager => Global_Heap.Pool, Storage => Global_Heap.Storage); -- create a dynamic collection holding products end products;
with ada.text_io, BC, products; use ada.text_io, BC, products; procedure collection_demo is products : productCollection.Dynamic_Collection; p : aProduct; i : productContainers.Iterator'class := productCollection.New_Iterator( products ); begin Put_Line( "This is a demo of the Booch components: collections" ); New_Line; products := productCollection.Create( 100 ); -- The Newly Declared Collection Put_Line( "The collection is newly declared with a chunk size of 100..." ); Put_Line( "The collection is empty? " & productCollection.Is_Empty( products )'img ); Put_Line( "The collection length is" & productCollection.Length( products )'img ); Put_Line( "The collection chunk size is" & productCollection.Chunk_Size( products )'img ); New_Line; -- Adding an Item p.id := 8301; p.weight := 17.0; productCollection.Append( products, p ); Put_Line( "Product id" & p.id'img & " was added..." ); Put_Line( "The collection is empty? " & productCollection.Is_Empty( products )'img ); Put_Line( "The collection length is" & productCollection.Length( products )'img ); Put_Line( "The collection chunk size is" & productCollection.Chunk_Size( products )'img ); p := productCollection.First( products ); Put_Line( "The first item is" & p.id'img ); p := productCollection.Last( products ); Put_Line( "The last item is" & p.id'img ); New_Line; -- Adding another Item p.id := 1732; p.weight := 27.0; productCollection.Append( products, p ); Put_Line( "Product id" & p.id'img & " was added..." ); Put_Line( "The collection is empty? " & productCollection.Is_Empty( products )'img ); Put_Line( "The collection length is" & productCollection.Length( products )'img ); Put_Line( "The collection chunk size is" & productCollection.Chunk_Size( products )'img ); p := productCollection.First( products ); Put_Line( "The first item is" & p.id'img ); p := productCollection.Last( products ); Put_Line( "The last item is" & p.id'img ); New_Line; -- Changing the Chunk Size productCollection.Set_Chunk_Size( products, Size => 1 ); Put_Line( "The chunk size was reduced to only 1..." ); Put_Line( "The collection is empty? " & productCollection.Is_Empty( products )'img ); Put_Line( "The collection length is" & productCollection.Length( products )'img ); Put_Line( "The collection chunk size is" & productCollection.Chunk_Size( products )'img ); p := productCollection.First( products ); Put_Line( "The first item is" & p.id'img ); p := productCollection.Last( products ); Put_Line( "The last item is" & p.id'img ); New_Line; -- Iterator example Put_Line( "Resetting the iterator.." ); productContainers.Reset( i ); p := productContainers.Current_item ( i ); Put_Line( "The current item is customer id" & p.id'img ); Put_Line( "Are we done? " & productContainers.Is_Done( i )'img ); Put_Line( "Advancing to the next item..." ); productContainers.Next( i ); p := productContainers.Current_item ( i ); Put_Line( "The current item is customer id" & p.id'img ); Put_Line( "Are we done? " & productContainers.Is_Done( i )'img ); Put_Line( "Advancing to the next item..." ); productContainers.Next( i ); Put_Line( "Are we done? " & productContainers.Is_Done( i )'img ); begin p := productContainers.Current_item ( i ); exception when BC.NOT_FOUND => put_line( "BC.NOT_FOUND exception: no item at this position in the collection" ); end;
Collections are suitable for small lists or lists where the upper bound is known or rarely exceeded.
This is a demo of the Booch components: collections The collection is newly declared with a chunk size of 100... The collection is empty? TRUE The collection length is 0 The collection chunk size is 100 Product id 8301 was added... The collection is empty? FALSE The collection length is 1 The collection chunk size is 100 The first item is 8301 The last item is 8301 Product id 1732 was added... The collection is empty? FALSE The collection length is 2 The collection chunk size is 100 The first item is 8301 The last item is 1732 The chunk size was reduced to only 1... The collection is empty? FALSE The collection length is 2 The collection chunk size is 1 The first item is 8301 The last item is 1732 Resetting the iterator.. The current item is customer id 8301 Are we done? FALSE Advancing to the next item... The current item is customer id 1732 Are we done? FALSE Advancing to the next item... Are we done? TRUE BC.NOT_FOUND exception: no item at this position in the collection
An ordered (or "priority") queue is a queue in which added items are sorted.
The queues package provides the following subprograms:
An ordered queue is identical except that append adds an item in sorted order.
Queues can be bounded, dynamic or unbounded.
Queues provide "fair" processing and reduce starvation.
The Stacks package provides the following subprograms:
Stacks can be bounded, dynamic or unbounded.
Stacks are used for temporary storage, compact representation and fast data access.
The Deques package provides the following subprograms:
Deques can be bounded, dynamic or unbounded.
In addition to the deque subprograms, rings include "Mark" to mark a point in the ring, "Rotate_To_Mark" to move the ring to the marked position, and "At_Mark" to test to see if the top of the ring is at the mark.
Rings can be bounded or dynamic.
The Maps package provides the following subprograms:
Maps are implemented with a hash table and caching.
Maps can be bounded, dynamic, unbounded or synchronized.
Maps are useful as translation tables.
Programs "walk" the tree by moving the root of the tree up and down the links to the items. Left_Child follows the left child link. Right_Child follows the right child link. Parent follows the parent link. Each of these subprograms can be used as a procedure (to move the root of the tree) or as a function (to examine the item the link connects to).
item := Item_At( tree ); Put( "Left child of " & item ) ; item := Item_At( Left_Child( tree ) ); Put_Line( " is " & item ) ;
When the root of the tree is moved, any items above the new root that aren't referenced anymore are destroyed. To move around the tree without destroying nodes (which is typically what you want to do), create an "alias" to the root of the tree with Create prior to moving.
root := Create( tree ); -- create a reference to the root Left_Child( tree ); -- safe: old root is not destroyed
Moving into an empty (null) position in the tree is allowed, but any attempt to look at the item there will raise an exception. The leaves and the parent of the root are empty.
The Trees.Binary package provides the following subprograms:
In addition, the tree may have an in_order, pre_order or post_order generic procedure. This procedure traverses the tree and executes processes each item. Pre_order processes an item before its children. Post_order processes an item after its children. In_order processes a node in the sort order of the tree--after all the left children but before all the right.
with BC.Containers.Trees.Binary.In_Order; with BC.Containers.Trees.Binary.Pre_Order; with BC.Containers.Trees.Binary.Post_Order; with Global_Heap; package shipment_binary is -- grandfather would be proud type aQuantity is ( Unknown, Basket_6Quart, Basket_11Quart, Bushel, Skid, Boxcar ); type aFruit is ( Apples, Grapes, Peaches, Cherries, Pears, Plums, Other ); type aShipment is record number : Positive; -- number of containers quantity : aQuantity; -- the containers contents : aFruit; -- type of fruit end record; procedure visitShipment( s : aShipment; OK : out boolean ); -- our tree traversal function package shipmentContainers is new BC.Containers( item=> aShipment ); -- basic fruit container package shipmentTrees is new shipmentContainers.Trees; -- basic tree support package shipmentBinaryTrees is new shipmentTrees.Binary( Storage_Manager => Global_Heap.Pool, Storage => Global_Heap.Storage ); -- our binary tree support procedure inOrdershipmentTraversal is new shipmentBinaryTrees.In_Order( visitShipment ); -- an in-order traversal procedure preOrdershipmentTraversal is new shipmentBinaryTrees.Pre_Order( visitShipment ); -- a pre-order traversal procedure postOrdershipmentTraversal is new shipmentBinaryTrees.Post_Order( visitShipment ); -- a post-order traversal end shipment_binary;
with ada.text_io; use ada.text_io; package body shipment_binary is procedure visitShipment( s : aShipment; OK : out boolean ) is -- our tree traversal function begin Put( "Shipment of" ); Put( s.number'img ); Put( " " ); Put( s.quantity'img ); Put( "(S) of " ); Put_Line( s.contents'img ); OK := true; end visitShipment; end shipment_binary;
with ada.text_io, shipment_binary; use ada.text_io, shipment_binary; procedure bintree_demo is use shipmentBinaryTrees; root : Binary_Tree; t : Binary_Tree; s : aShipment; OK : boolean; begin Put_Line( "This is a demo of the Booch components: binary trees" ); New_Line; -- this is the root item s.number := 5; s.quantity := basket_6quart; s.contents := cherries; Insert( t, s, Child => Left ); -- child doesn't really matter because there's no prior item at the root root := Create( t ); -- remember where the root is -- add to left of root s.number := 7; s.quantity := basket_11quart; s.contents := pears; Append( t, s, Child => Left, After => Left ); -- child doesn't really matter here -- add to right of root s.number := 12; s.quantity := bushel; s.contents := apples; Append( t, s, Child => Left, After => Right ); -- child doesn't really matter here Left_Child( t ); -- move "t" down left branch s.number := 3; s.quantity := skid; s.contents := peaches; Append( t, s, Child => Left, After => Right ); -- child doesn't really matter here Put_Line( "Our tree is: "); Put_Line( " 5 6 qt baskets of cherries" ); Put_Line( " |" ); Put_Line( " +----------------------------------------------------+" ); Put_Line( " | |" ); Put_Line( "7 11 qt baskets of pears 12 bushels of apples" ); Put_Line( " |" ); Put_Line( " +-------------------------------|" ); Put_Line( " 3 skids of peaches" ); New_Line; Put_Line( "In-order traversal:" ); inOrderShipmentTraversal( root, OK ); if not OK then Put_Line( "The traversal was interrupted" ); end if; New_Line; Put_Line( "Pre-order traversal:" ); preOrderShipmentTraversal( root, OK ); if not OK then Put_Line( "The traversal was interrupted" ); end if; New_Line; Put_Line( "Post-order traversal:" ); postOrderShipmentTraversal( root, OK ); if not OK then Put_Line( "The traversal was interrupted" ); end if; end bintree_demo;
This is a demo of the Booch components: binary trees Our tree is: 5 6 qt baskets of cherries | +----------------------------------------------------+ | | 7 11 qt baskets of pears 12 bushels of apples | +-------------------------------| 3 skids of peaches In-order traversal: Shipment of 7 BASKET_11QUART(S) of PEARS Shipment of 3 SKID(S) of PEACHES Shipment of 5 BASKET_6QUART(S) of CHERRIES Shipment of 12 BUSHEL(S) of APPLES Pre-order traversal: Shipment of 5 BASKET_6QUART(S) of CHERRIES Shipment of 7 BASKET_11QUART(S) of PEARS Shipment of 3 SKID(S) of PEACHES Shipment of 12 BUSHEL(S) of APPLES Post-order traversal: Shipment of 3 SKID(S) of PEACHES Shipment of 7 BASKET_11QUART(S) of PEARS Shipment of 12 BUSHEL(S) of APPLES Shipment of 5 BASKET_6QUART(S) of CHERRIES
Binary trees should not be Guarded.
The AVL package provides fewer subprograms than the binary tree package:
There are no subprograms for walking the tree.
Here is a sample declaration:
with BC.Containers.Trees.AVL; with Global_Heap; package fruit_avl is -- more fun with fruit type aQuantity is ( Unknown, Basket_6Quart, Basket_11Quart, Bushel, Skid, Boxcar ); type aFruit is ( Apples, Grapes, Peaches, Cherries, Pears, Plums, Other ); type aShipment is record number : Positive; -- number of containers quantity : aQuantity; -- the containers contents : aFruit; -- type of fruit end record; function sortCriteria( left, right : aShipment ) return boolean; -- for sorting the AVL tree package shipmentContainers is new BC.Containers( item=> aShipment ); -- basic fruit container package shipmentTrees is new shipmentContainers.Trees; -- basic tree support package shipmentAVLTrees is new shipmentTrees.AVL( sortCriteria, Storage_Manager => Global_Heap.Pool, Storage => Global_Heap.Storage ); -- our AVL tree support end fruit_avl;
package body fruit_avl is function sortCriteria( left, right : aShipment ) return boolean is begin return left.number < right.number; end sortCriteria; end fruit_avl;
AVL trees have slower inserts and deletes than binary trees but are faster than a normal binary tree for searching.
The subprograms are similar to a binary tree. The append procedures add child items to an item. A new function called "Arity" returns the number children an item has.
Multiway trees should not be Guarded.
Essentially, graphs are a generalization of maps where any number of items can be related to each other (as opposed to only two).
A directed graph is a set of items (vertices) that are connected by relationships (edges or "arcs"). Like a single linked list, a program can only move forward along an arc.
Items can also be linked to themselves.
The graphs-directed package provides the following subprograms:
There are four iterators: a graph iterator, and three iterators for visiting items (incoming, outgoing and both).
An undirected graph is a directed graph with pointers to both the previous and next item along an arc. Like a double linked list, a program can move forwards or backwards along an arc.
The graphs-undirected package provides the following subprograms:
There are two iterators: a graph iterator and an item iterator.
Graphs should not be Guarded.
The smart package provides the following subprograms:
with BC.smart; package depts is type departments is ( accounting, information_technology, shipping, human_resources ); type deptAccess is access all departments; package deptPtrs is new BC.smart( departments, deptAccess ); end depts;
with ada.text_io, depts; use ada.text_io, depts; procedure sp_demo is accountingPtr : deptPtrs.Pointer; accounting2Ptr : deptPtrs.Pointer; department : deptAccess; begin Put_Line( "This is a demo of the Booch components: smart pointers" ); New_Line; department := new departments'( accounting ); Put_Line( "Assigning dynamically allocate value to a smart pointer" ); accountingPtr := deptPtrs.Create( department ); Put_Line( "The accounting pointer points at " & deptPtrs.Value( accountingPtr ).all'img ); New_Line; Put_Line( "Assigning a smart pointer to a smart pointer" ); accounting2Ptr := accountingPtr; Put_Line( "The accounting pointer 2 points at " & deptPtrs.Value( accounting2Ptr ).all'img ); New_Line; Put_Line( "The memory is released when the program ends or no more pointers" ); Put_Line( "access the memory." ); end sp_demo;
This is a demo of the Booch components: smart pointers Assigning dynamically allocate value to a smart pointer The accounting pointer points at ACCOUNTING Assigning a smart pointer to a smart pointer The accounting pointer 2 points at ACCOUNTING The memory is released when the program ends or no more pointers access the memory.
Booch components can be guarded (manually "locking" the structure for exclusive access) or synchronized (implicit blocking) for multithreading purposes.
Guarding is implemented by creating extending a container type to a Guarded_Container using the GC.Containers.Guarded package. Guarded containers contain two new subprograms, "Seize" and "Release", to lock and unlock a container. (This is implemented using a semaphore.) Any Booch data structure can be made guarded using guarded containers, but in some cases guarding will not work as expected and should not be used (for example, with lists).
The basic semaphore locks individual objects (although it many not work as expected on certain structures such as lists, according to AdaPower.Net). The basic semaphore can be extended and customized by a programmer.
Rewriting the Bags example with guards:
with BC.Containers.Bags.Unbounded; with BC.Containers.Guarded; with BC.Support.Synchronization; with Global_Heap; package guarded_customers is type aCustomerID is new integer range 1_000..9_999; function IDHash( id : aCustomerID ) return Positive; -- our hash function package customerContainers is new BC.Containers (Item => aCustomerID); -- this is the basic container package customerBags is new customerContainers.Bags; -- create a new bag support for our using container type package customerBag is new customerBags.Unbounded( Hash => IDHash, Buckets => 99, Storage_Manager => Global_Heap.Pool, Storage => Global_Heap.Storage); -- create an unbounded bag holding customer numbers package customerGuardedBag is new customerContainers.Guarded ( Base_Container => customerBag.Unbounded_Bag, Semaphore => BC.Support.Synchronization.Semaphore ); -- create a new controlled tagged record container for customers end guarded_customers;
A new guarded bag can now be declared:
customers : customerGuardedBag.Guarded_Container;
and the bag can be locked using
customerGuardedBag.Seize( customers );
Synchronized access by threads is implemented in special versions of the data structure packages (for example, maps.synchronized). With synchronized packages, the implementation details are hidden from the user.
Ada Core Technologies provides a set of XML and Unicode packages called XMLAda. It is open source and can be downloaded from the ACT Europe's web site at https://libre2.adacore.com/xmlada/. Limited documentation is available online at this web site as well.
Unicode is bundled together with XML because XML uses Unicode characters. Unicode characters are not the same as Ada's standard 16-bit wide character type. Wide characters are based on a different standard.
This is an overview of using these packages, including short examples to see how they are used. It does not discuss the details of Unicode or XML.
When you configure XMLAda, you're given a choice of where to install the library (/usr/local might be a good choice). To compile, bind and link your programs, you'll need to include the locations of the library and Ada package spec files. For example, for /usr/local you will need to use these options:
If you are not using a particular subsystem, you can omit the -l library option. However, DOM is built using SAX--if you use the DOM packages, you'll need to include the SAX library.
Unicode comes in three versions: fixed length 32-bit UTF-32, variable length 16-bit UTF-16, and variable length UTF-8. All three versions can represent the same characters, from ASCII to ancient languages. UTF-32 is the basic Unicode type and each character is always 32 bits. UTF-16 is a compressed version of UTF-32: each character is 16-bits but some character as represented by two 16-bit characters in a row. UTF-8 has even more compression: the first 128 characters are equivalent to ASCII (or Latin-1), the standard Ada character sets, but the upper 128 codes can be used for characters using up to 6 consecutive bytes for one character. The Ada packages support all three types of Unicode.
An example of UTF-8 in many languages is available on the UTF-8 Sampler Page. This web page is composed of UTF-8 characters and uses "<META http-equiv="Content-Type" content="text/html; charset=utf-8"> in the HTML header. The xterm program is also supposed to support UTF-8 characters (for example, if you dump a UTF-8 document to standard output in an xterm window).
Do you need to know the character codes? They are available in sets of PDF charts at unicode.org.
The Unicode packages sometimes work with string types, sometimes with numeric types. A basic UTF-32 character (type unicode_char defined in unicode.ads) and is a 32-bit number.
type Unicode_Char is mod 2**32;
Basic UTF-32 functions such as is_letter, is_digit or to_unicode (ASCII/Latin-1 to UTF-32 character) are found in unicode.ads.
The first 127 characters in Unicode are identical to the 127 character ASCII set. If you are strictly working with these characters, it's easy to convert them to Unicode characters.
with unicode; use unicode; procedure ASCII_to_Unicode is utf_32 : unicode_char; -- UTF-32 character ascii_ch : character; -- Latin-1 (or ASCII) character begin utf_32 := character'pos( 'A' ); ascii_ch := character'val( utf_32 ); end ASCII_to_Unicode;
Characters above ASCII 127 will result in garbage or a run-time CONSTRAINT_ERROR exception.
The Unicode packages do not define a UTF-8 character type but you can create your own type.
procedure ASCII_to_UTF_8 is subtype unicode_utf_8_char is unicode_char range 0..255; utf_8 : unicode_utf_8_char; ascii_ch : character; begin utf_8 := character'pos( 'A' ); ascii_ch := character'val( utf_8 ); end ASCII_to_UTF_8;
Memory can be saved by limiting the UTF-8 character to 7 bits.
type unicode_utf_8_char is new unicode_char range 0..127; for unicode_utf_8_char'size use 7;
Latin-1 (the extended ASCII standard used by Ada) and the first 8 bits of UTF-32 are not identical. To convert between them, you'll need to use a Unicode conversion function. The Unicode packages contain many conversion functions under the unicode.ccs package hierarchy. In particular, unicode.css.iso_8859_1.ads performs Latin-1 conversions.
The following short program demonstrates how to convert between UTF-32 strings and Ada unbounded strings (Latin-1 characters). (The next section will show how to convert any string to any character set.)
with ada.text_io, ada.strings.unbounded, unicode.ccs.iso_8859_1; use ada.text_io, ada.strings.unbounded, unicode, -- basic Unicode unicode.ccs, -- Unicode conversions unicode.ccs.iso_8859_1; -- Latin-1 conversions procedure unitest is type a_unicode_string is array(1..80) of unicode_char; function to_unicode_string( msg : unbounded_string ) return a_unicode_string is -- to 32 byte unicode characters (UTF-32) result : a_unicode_string := (others => 0); begin for c in 1..length( msg ) loop result(c) := to_unicode( element( msg, c ) ); end loop; return result; end to_unicode_string; function to_unbounded_string( msg : a_unicode_string ) return unbounded_string is -- to Latin-1 result : unbounded_string; c : natural; begin c := 1; while c <= a_unicode_string'last and then msg( c ) /= 0 loop result := result & character'val( to_iso_8859_1( msg( c ) ) ); c := c + 1; end loop; return result; end to_unbounded_string; msg : unbounded_string := to_unbounded_string( "this is a test" ); umsg : a_unicode_string; msg2 : unbounded_string; begin put_line( "Original: " & to_string( msg ) ); umsg := to_unicode_string( msg ); msg2 := to_unbounded_string( umsg ); put_line( "After conversion: " & to_string( msg2 ) ); end unitest;
Using characters above Latin-1 255 will result in garbage or a run-time CONSTRAINT_ERROR exception.
UTF-32 characters are the most flexible form of Unicode but they use a large amount of space. UTF-8 uses less space than UTF-32 and is compatible with ASCII but the characters can require up to 6 consecutive bytes. Unfortunately, although the Unicode packages support UTF-8, it is not easy to use. It is not supported directly and all conversions must be done through UTF-32. This creates some confusing terminology as there are many "to_utf32" and "from_utf32" functions that work on different kinds of 32-bit characters (not necessarily "UTF" characters at all!).
Converting an Ada string into UTF-8 format is a 3 step process:
The unicode.ces packages include string types that work with the functions. String types with the name "UTF32" have 32-bit characters. "UTF8" strings have 8-bit characters. "LE" strings are "little-endian". These strings are all renamings of a standard Ada string and are provided to make programs easier to read. That is, a unicode_char is a number but a utf32_le_string is a string type, not an array of numbers as one might expect. Another effect is that you cannot use the unicode.ces packages and refer to the functions: they must be named in full so the compiler will know what functions you are referring to.
Putting it together, here is the method to convert an Ada string to a UTF-8 string:
s : string := "This is a test"; -- 8-bit Latin-1 string (normal Ada string) s_32 : utf32_le_string := unicode.ces.basic_8bit.to_utf32( s ); -- 32-bit Latin-1 string (normal Ada string with 32-bit characters) u_32 : utf32_le_string := unicode.ces.utf32.to_unicode_le( s_32, cs => unicode.ccs.iso_8859_1.iso_8859_1_character_set ); -- UTF-32 string (convert Latin-1 to Unicode characters) u_8 : utf8_string := unicode.ces.utf8.from_utf32( u_32 ); -- change UTF-32 to UTF-8
The UTF-8 string may have more bytes than the original string. Use the unicode.ces.utf8.length function (instead of the 'length attribute) to determine the number of characters in the string.
To convert from UTF-8 to Ada strings:
Here is a complete example that translates a string to UTF-8 and back to an Ada string:
with ada.text_io, unicode.ces.utf8, unicode.ces.utf32, unicode.ces.basic_8bit, unicode.ccs.iso_8859_1; use ada.text_io, unicode, unicode.ccs, unicode.ces, unicode.ces.utf8, unicode.ces.utf32; procedure there_and_back_again is s : string := "This is a test"; -- 8-bit Latin-1 string (normal Ada string) s_32 : utf32_le_string := unicode.ces.basic_8bit.to_utf32( s ); -- 32-bit Latin-1 character (normal Ada string with 32-bit characters) u_32 : utf32_le_string := unicode.ces.utf32.to_unicode_le( s_32, cs => unicode.ccs.iso_8859_1.iso_8859_1_character_set ); -- UTF-32 string (convert Latin-1 to Unicode characters) u_8 : utf8_string := unicode.ces.utf8.from_utf32( u_32 ); -- change UTF-32 to UTF-8 undo_u_32 : utf32_le_string := unicode.ces.utf8.to_utf32( u_8 ); -- change UTF-8 to UTF-32 undo_s_32 : utf32_le_string := unicode.ces.utf32.to_unicode_le( undo_u_32, cs => unicode.ccs.iso_8859_1.iso_8859_1_character_set ); -- change UTF-32 to Latin-1 with 32-bit characters -- KB: is this right? Would make more sense as to_cs but that function -- throws an exception... undo_s : string := unicode.ces.basic_8bit.from_utf32( undo_s_32 ); -- back to original string "This is a test" begin put_line( "Original string = '" & s & "'" ); put_line( " To UTF-8 = '" & u_8 & "'" ); put_line( " Back again = '" & undo_s & "'" ); if s = undo_s then put_line( "Translation successful" ); end if; end there_and_back_again;
The results:
Original string = 'This is a test' To UTF-8 = 'This is a test' Back again = 'This is a test' Translation successful
Using the same process, you can use UTF-16 by using the appropriate packages.
KB: Do the strings need to be free'd?
To provide a standard way of reading XML information, XMLAda includes a set of input packages. This packages are designed to read information from a file or string. You can also create custom input functions to read XML information from other sources.
The Input packages are designed to look like standard Ada file I/O operations. To use an XML file, use the Input_Sources.File package. You open the file, get a Unicode character, check for the end of file and close the file. The file could be a string or some other form of input. The Set_Encoding procedure informs the Input packages about the type of characters (Latin_1, UTF-8, etc.), although the encoding of a file can usually be determined automatically.
data : Input_Sources.File.File_Input; ... Input_Sources.File.Open( "data.xml", data ); -- Parse the information Input_Sources.File.Close( data );
If you are using SAX or DOM, you normally only need to open or close the XML source. The parsers will handle reading the XML data.
This program will dump the contents of an XML file called "data.xml" to standard output:
with ada.text_io, unicode, input_sources.file; use ada.text_io, unicode, input_sources.file; procedure xml_dump is data : File_Input; uc : unicode_char; begin Open( "data.xml", data ); while not Input_Sources.File.Eof( data ) loop Next_Char( data, uc ); if uc >= 32 and uc < 127 then -- Printable ASCII? put( character'val( uc ) ); elsif uc = 10 then -- Linefeed new_line; else put( "[#" & uc'img & ']' ); -- other character, print numeric code end if; end loop; new_line; Close( data ); end xml_dump;
Strings are more difficult. Although there is a Input_Sources.Strings package, there is no predefined handling for Ada or Unicode strings. You have to write a set of callbacks to read the string yourself.
with ada.text_io, unicode.ces, unicode.ccs.iso_8859_1, input_sources.strings; use ada.text_io, unicode, unicode.ces, input_sources.strings; procedure xml_dump_string is data : String_Input; uc : unicode_char; s : aliased byte_sequence := "<thing>This is a test</thing>"; -- s is the source string -- These callbacks are described in unicode.ces. For Latin_1, they need -- to work with one byte characters only. procedure s_read( Str : Byte_Sequence; Index : in out Positive; Char : out Unicode_Char) is begin char := character'pos( str( index ) ); index := index + 1; end s_read; function s_width( char : Unicode_Char ) return natural is begin return 1; end s_width; procedure s_encode( Char : Unicode_Char; Output : in out Byte_Sequence; Index : in out Natural) is begin output( index+1 ) := character'val( char ); end s_encode; function s_length( str : byte_sequence ) return natural is begin return str'length; end s_length; scheme : encoding_scheme := ( s_read'unrestricted_access, s_width'unrestricted_access, s_encode'unrestricted_access, s_length'unrestricted_access); -- a collection of callbacks to read string s -- I should not use unrestricted_access but don't want to fool around with -- Ada's pointer scoping restrictions for this short example... begin Open( s'unrestricted_access, scheme, data ); while not Eof( data ) loop Next_Char( data, uc ); if uc >= 32 and uc < 127 then -- Printable ASCII? put( character'val( uc ) ); elsif uc = 10 then -- Linefeed new_line; else put( "[#" & uc'img & ']' ); end if; end loop; new_line; Close( data ); end xml_dump_string;
The results:
<thing>This is a test</thing>
Interpreting XML is difficult because it has a recursive structure: using nested tags allows great flexibility but it means that any individual tag only has meaning when considered in the context of the surrounding tags. Or to put it another way, you can't grep XML.
XMLAda provides two methods for interpreting (or parsing) an XML document. The first is SAX (Simple API for XML) and is particularily useful for short documents or documents without a lot of nested context. SAX is sometimes referred to as as callback or push-based parsing. SAX reads through an XML document and calls procedures you define to handle the different parts of the document SAX encounters. Because SAX is a standard, using SAX in Ada is very similar to using SAX in other computer languages.
Create a set of handlers for different XML components (elements (or tags), attributes, free text, etc.). These handlers should be put in a tagged record extended from sax.readers.reader. With objects, you can create a hierarchy of readers for different kinds of XML files.
Since SAX uses callbacks, it doesn't have to load the entire XML document at one time. XML documents with simple structures can be quickly analysed.
The simplest SAX program reads through an XML file and does nothing by default:
with ada.text_io, unicode.ces, input_sources.file, sax.readers; use ada.text_io, unicode, input_sources.file, sax.readers; procedure sax_nothing is data : File_Input; r : sax.readers.reader; begin Open( "data.xml", data ); Parse( r, data ); Close( data ); end sax_nothing;
To do something useful, you have to extend the class and create new handler functions. Create a new tagged record based on sax.readers.reader and replace that default content handlers for the content you want to examine. The most common ones are:
The parameters and their names are based on the SAX standard. The most common parameters are:
There are many possible handlers. Check sax.readers.ads for a complete list.
If the XML file is incorrect, an XML_FATAL_ERROR exception will be raised.
RSS is an XML standard for reporting news articles. An RSS file contains news articles look like this:
<item> <title>The Headline</title> <link>Link to the article</title> ... </item>
Suppose you want to write a program to display news headlines. To do this, you'd need to capture the free text (the Characters callback) and display the text when the </title> tag is reached (the End_Element callback). But you only want to display the <title> tags inside of <item> tags because there are other titles (such as the title of the web site where the items came from).
Create a new package called my_readers to contain our RSS tagged record called my_rss_reader. The record will have handlers for starting and ending tags as well as character content. The titles are displayed with the dump procedure. This procedure assumes that the XML file is ASCII or Latin-1, but the procedure can be rewritten to handle Unicode as discussed above.
-- my_readers.ads with Sax.Exceptions; with Sax.Locators; with Sax.Readers; with Sax.Attributes; with Sax.Models; with Unicode.CES; with ada.strings.unbounded; use ada.strings.unbounded; package my_readers is type my_rss_reader is new sax.readers.reader with private; procedure Start_Element (Handler : in out my_rss_reader; Namespace_URI : Unicode.CES.Byte_Sequence := ""; Local_Name : Unicode.CES.Byte_Sequence := ""; Qname : Unicode.CES.Byte_Sequence := ""; Atts : Sax.Attributes.Attributes'Class); -- The start of a tag. e.g. <title> procedure End_Element (Handler : in out my_rss_reader; Namespace_URI : Unicode.CES.Byte_Sequence := ""; Local_Name : Unicode.CES.Byte_Sequence := ""; Qname : Unicode.CES.Byte_Sequence := ""); -- The end of a tag. e.g. </title> procedure Characters (Handler : in out my_rss_reader; Ch : Unicode.CES.Byte_Sequence); -- The free text contained between tags. Ch is a string not a single character. -- The rest we don't need private type my_rss_reader is new sax.readers.reader with record title : unbounded_string; -- last free text in_item : boolean := false; -- true if in an <item> end record; end my_readers;
-- my_readers.adb with ada.text_io; use ada.text_io; package body my_readers is procedure dump( ch : unicode.ces.byte_sequence ) is -- Dump some ASCII-compatible characters. This assumes ASCII or -- Latin-1 characters in the XML file. c : character; ch_code : natural; begin for i in 1..ch'length loop c := ch(i); ch_code := character'pos( c ); if ch_code >= 32 and ch_code < 127 then -- Printable ASCII? put( c ); elsif ch_code = 10 then -- Linefeed new_line; else -- Others put( "[#" & ch_code'img & ']' ); end if; end loop; end dump; procedure Start_Element (Handler : in out my_rss_reader; Namespace_URI : Unicode.CES.Byte_Sequence := ""; Local_Name : Unicode.CES.Byte_Sequence := ""; Qname : Unicode.CES.Byte_Sequence := ""; Atts : Sax.Attributes.Attributes'Class) is begin if local_name = "item" then -- <item> Handler.in_item := true; -- starting an item end if; end Start_Element; procedure End_Element (Handler : in out my_rss_reader; Namespace_URI : Unicode.CES.Byte_Sequence := ""; Local_Name : Unicode.CES.Byte_Sequence := ""; Qname : Unicode.CES.Byte_Sequence := "") is begin if local_name = "title" then -- </title> if Handler.in_item then -- item title? dump( to_string( handler.title ) ); -- show it new_line; end if; elsif local_name = "item" then -- </item> Handler.in_item := false; -- leaving an item end if; end End_Element; procedure Characters (Handler : in out my_rss_reader; Ch : Unicode.CES.Byte_Sequence) is begin handler.title := to_unbounded_string( ch ); end Characters; end my_readers;
-- rss.adb with ada.text_io, unicode.ces, input_sources.file, my_readers; use ada.text_io, unicode, input_sources.file, my_readers; procedure rss is data : File_Input; r : my_rss_reader; begin Open( "coder.rss", data ); Parse( r, data ); Close( data ); end rss;
The results for my coder.rss file for my Lone Coder blog is as follow:
Lone Coder: Google: Lawful Good or Chaotic Neutral? Lone Coder: The Tyrrany of the Label Lone Coder: Losing Control of your Linux Startup Lone Coder: OpenSuSE 10: Developer Dream or Crippleware? Lone Coder: In Mourning of Statftime Lone Coder: The Need for Speed: Speedbumps on the Web
An attribute is additional information added to an XML tag. For example, in HTML with <p align="right">, align="right" is an attribute. align is the name of the attribute. right is the value of the attribute.
The Start_Element handler has an Atts tagged record that describes any attributes included with the tag. The subprograms available for attributes are described in sax-attributes.ads. The most commonly used subprograms are:
Attributes are numbered from zero. To display the attributes on an XML tag, use the following in your Start_Element handler:
use SAX.Attributes; ... for i in 0..get_length( Atts )-1 loop put_line( "Name = " & get_local_name( Atts, i ) ); put_line( "Value = " & get_value( Atts, i ) ); end loop;
SAX calls your callbacks as it parsers an XML file. DOM, Document Object Model, works by loading the entire XML file and building a tree based on the tags. Using the DOM packages, you can "walk" the tree, node by node. This uses more memory but doesn't constrain how you examine the document the way SAX does (top to bottom). DOM, like SAX, is a standard and using DOM in Ada is similar to using DOM in other languages.
The default DOM reader, Tree_Reader, is used to create the document tree. Unlike SAX, the default reader will provide most of the functionality you need and you won't need to extend it.
with input_sources.file, dom.readers, dom.core; use dom.readers, dom.core, input_sources.file; procedure dom_nothing is data : File_Input; r : Tree_Reader; d : Document; begin Open( "data.xml", data ); d := get_tree( r ); Parse( r, data ); free( r ); Close( data ); end dom_nothing;
A longer example: display the first XML tag with attributes.
with input_sources.file, dom.readers, dom.core.documents, dom.core.nodes; use dom.readers, dom.core, dom.core.documents, dom.core.nodes, input_sources.file; with ada.text_io; use ada.text_io; procedure dom_first_node is procedure dump( n : node ) is begin if n /= null then case n.node_type is when element_node => -- An XML tag put( "<" ); put( node_name( n ) ); -- show name declare -- attributes AS : Named_Node_Map := Attributes (N); begin for I in 0..Length( AS )-1 loop -- numbered from zero put( " " ); put( node_name( Item(AS, I) ) ); put( "=" ); put( node_value( Item(AS, I) ) ); end loop; end; put_line( ">" ); when attribute_node => -- An attribute put_line( node_name( n ) & "=" & node_value( n ) ); when others => -- Other put_line( "Unknown tree node type" ); end case; end if; end dump; data : File_Input; r : Tree_Reader; d : Document; document_root : element; begin Open( "data.xml", data ); Parse( r, data ); d := get_tree( r ); document_root := get_element( d ); dump( document_root ); free( r ); Close( data ); end dom_first_node;
For my coder.rss file, this program displays:
<rss version=2.0>
The dom-core-nodes.ads package includes navigation subprograms to move around the document and examine nodes (XML items). first_child returns the first child nested in an XML element. next_sibling moves to the next adjacent item. For a complete example, David Botton has an example program that walks an XML document using DOM at his AdaPower web site.
AdaCL is a library for writing small day-to-day programs normally handled by scripting languages. It includes CGI, garbage collection and improved string functions. The home page for the project is AdaCL.
An alternative to my BUSH project
.SAL is another "standard template library"-type project providing general data structures, vectors and math functions with a heavy emphasis on generics. The library is located at http://www.toadmail.com/~ada_wizard/ada/sal.html.
Summary of packages:
SAL.Aux.Definite_Private_Items; SAL.Aux.Definite_Private_Items; SAL.Aux.Enum_Iterators; SAL.Aux.Indefinite_Limited_Items; SAL.Aux.Indefinite_Private_Items.Comparable; SAL.Aux.Indefinite_Private_Items; SAL.Aux.Sort_Indefinite_Items_Definite_Keys; SAL.Config_Files.Boolean; SAL.Config_Files.Duration; SAL.Config_Files.Integer; SAL.Config_Files.Time; SAL.Config_Files; SAL.Endianness; SAL.Gen.Alg.Count; SAL.Gen.Alg.Find_Binary; SAL.Gen.Alg.Find_Linear; SAL.Gen.Alg.Find_Linear.Sorted; SAL.Gen.Alg.Process_All_Constant; SAL.Gen.Alg; SAL.Gen.Gray_Code; SAL.Gen.Lists.Double.Iterators; SAL.Gen.Lists.Double; SAL.Gen.Lists; SAL.Gen.Stacks.Bounded_Limited; SAL.Gen.Stacks.Bounded_Nonlimited; SAL.Gen.Stacks; SAL.Gen.Word_Order_Convert.Scalar_32; SAL.Gen.Word_Order_Convert.Scalar_64; SAL.Gen.Word_Order_Convert; SAL.Gen_Array_Image; SAL.Gen_Array_Text_IO; SAL.Gen_FIFO; SAL.Gen_Math.Gen_Den_Hart; SAL.Gen_Math.Gen_DOF_2; SAL.Gen_Math.Gen_DOF_3.Gen_Image; SAL.Gen_Math.Gen_DOF_3; SAL.Gen_Math.Gen_DOF_6.Gen_Image; SAL.Gen_Math.Gen_DOF_6.Gen_Integrator_Utils; SAL.Gen_Math.Gen_DOF_6; SAL.Gen_Math.Gen_Inverse_Array; SAL.Gen_Math.Gen_Manipulator; SAL.Gen_Math.Gen_Runge_Kutta_4th; SAL.Gen_Math.Gen_Scalar; SAL.Gen_Math.Gen_Square_Array.Gen_Inverse; SAL.Gen_Math.Gen_Square_Array; SAL.Gen_Math.Gen_Vector.Gen_Image; SAL.Gen_Math.Gen_Vector; SAL.Generic_Binary_Image; SAL.Generic_Decimal_Image; SAL.Generic_Float_Image; SAL.Generic_Hex_Image; SAL.Math_Double.Den_Hart; SAL.Math_Double.DOF_2; SAL.Math_Double.DOF_3.Cacv_Inverse; SAL.Math_Double.DOF_3.Image; SAL.Math_Double.DOF_3; SAL.Math_Double.DOF_6.Image; SAL.Math_Double.DOF_6.DC_Array_DCV_Inverse; SAL.Math_Double.DOF_6.Integrator_Utils; SAL.Math_Double.DOF_6; SAL.Math_Double.Elementary; SAL.Math_Double.Scalar; SAL.Math_Double.Text_IO; SAL.Math_Float.Den_Hart; SAL.Math_Float.DOF_2; SAL.Math_Float.DOF_3.Cacv_Inverse; SAL.Math_Float.DOF_3.Image; SAL.Math_Float.DOF_3; SAL.Math_Float.DOF_6.DC_Array_DCV_Inverse; SAL.Math_Float.DOF_6.Integrator_Utils; SAL.Math_Float.DOF_6; SAL.Math_Float.Elementary; SAL.Math_Float.Polynomials; SAL.Math_Float.Scalar; SAL.Math_Float.Text_IO; SAL.Math_Float_Kraft_HC_Nominal; SAL.Math_Float_Manipulator_6; SAL.Math_Float_Manipulator_7; SAL.Math_Float_RRC_K1607_Nominal; SAL.Memory_Streams.Address; SAL.Memory_Streams.Bounded; SAL.Memory_Streams; SAL.Poly.Alg.Count; SAL.Poly.Alg.Find_Linear; SAL.Poly.Alg.Process_All_Constant; SAL.Poly.Binary_Trees.Sorted.Iterators; SAL.Poly.Binary_Trees.Sorted; SAL.Poly.Binary_Trees; SAL.Poly.Function_Tables.Monotonic.First_Order; SAL.Poly.Function_Tables.Monotonic; SAL.Poly.Function_Tables; SAL.Poly.Lists.Double; SAL.Poly.Lists.Single.Iterators; SAL.Poly.Lists.Single; SAL.Poly.Lists; SAL.Poly.Stacks.Unbounded_Array; SAL.Poly.Stacks; SAL.Poly.Unbounded_Arrays; SAL.Poly; SAL.Simple.Function_Tables.Monotonic.First_Order; SAL.Simple.Function_Tables.Monotonic; SAL.Simple.Function_Tables; SAL.Simple.Searches.Binary; SAL.Simple.Searches; SAL.Simple; SAL.Time_Conversions;
<--Last Chapter | Table of Contents | Next Chapter--> |