Algorithms For Dummies. John Paul Mueller

Algorithms For Dummies - John Paul Mueller


Скачать книгу

      Even when you find a good solution, one that is both efficient and effective, you still need to know precisely what the solution costs. You may find that the cost of using a particular solution is still too high, even when everything else is considered. Perhaps the answer comes almost, but not quite, on time or it uses too many computing resources. The search for a good solution involves creating an environment in which you can fully test the algorithm, the states it creates, the operators it uses to change those states, and the time required to derive a solution.

      Representing the problem as a space

      A problem space is an environment in which a search for a solution takes place. A set of states and the operators used to change those states represent the problem space. For example, consider a tile game that has eight tiles in a 3-x-3 frame. Each tile shows one part of a picture, and the tiles start in some random order so that the picture is scrambled. The goal is to move one tile at a time to place all the tiles in the right order and reveal the picture. You can see an example of this sort of puzzle at https://www.proprofsgames.com/puzzle/sliding/.

      The combination of the start state, the randomized tiles, and the goal state — the tiles in a particular order — is the problem instance. You could represent the puzzle graphically using a problem space graph. Each node of the problem space graph presents a state (the eight tiles in a particular position). The edges represent operations, such as to move tile number eight up. When you move tile eight up, the picture changes — it moves to another state.

      Winning the game by moving from the start state to the goal state isn’t the only consideration. To solve the game efficiently, you need to perform the task in the least number of moves possible, which means using the smallest number of operations. The minimum number of moves used to solve the puzzle is the problem depth.

      You must consider several factors when representing a problem as a space. For example, you must consider the maximum number of nodes that will fit in memory, and whether the number of nodes that memory can support matches the expected number of nodes necessary to solve the problem, which represents the space complexity. When you can’t fit all the nodes in memory at one time, the computer must generate them only when necessary and then discard the previous nodes to free memory or store some nodes in other locations. To determine whether the nodes will fit in memory, you must consider time complexity because longer algorithm runs determinate the maximum number of nodes created to solve the problem. In addition, it’s important to consider the branching factor, which is the average number of nodes created at each step in the problem space graph to solve a problem. For the same solution, an algorithm with a higher branching factor will generate more nodes than one with a lower branching factor.

      Going random and being blessed by luck

      Solving a search problem using brute-force techniques (described in “Avoiding brute-force solutions,” earlier in this chapter) is possible. The advantage of this approach is that you don’t need any domain-specific knowledge to use one of these algorithms. A brute-force algorithm tends to use the simplest possible approach to solving the problem. The disadvantage is that a brute-force approach works well only for a small number of nodes. Here are some of the common brute-force search algorithms:

Technique Description Cons Pros
Breadth-first search Begins at the root node, explores each of the child nodes first, then moves down to the next level. It progresses level by level until it finds a solution. Must store every node in memory, which means that it uses a considerable amount of memory for a large number of nodes. Can check for duplicate nodes to save time and always comes up with a solution.
Depth-first search Begins at the root node and explores a set of connected child nodes until it reaches a leaf node. It progresses branch by branch until it finds a solution. Can’t check for duplicate nodes, which means that it might traverse the same node paths more than once. It’s memory efficient.
Bidirectional search Searches simultaneously from the root node and the goal node until the two search paths meet in the middle. It’s time efficient and uses memory more efficiently than other approaches, and it always finds a solution. Complexity of implementation, translating into a longer development cycle.

      Using a heuristic and a cost function

       Pure heuristic search: Expands nodes in order of their cost. It maintains two lists. The closed list contains the nodes it has already explored; the open list contains the nodes it must yet explore. In each iteration, the algorithm expands the node with the lowest possible cost. All its child nodes are placed in the closed list and the individual child node costs are calculated. The algorithm sends the child nodes with a low cost back to the open list and deletes the child nodes with a high cost.

       A * search: Tracks the cost of nodes as it explores them (and choosing the least expensive ones) using this equation: f(n) = g(n) + h(n), wheren is the node identifier.g(n) is the cost of reaching the node so far.h(n) is the estimated cost to reach the goal from the node.f(n) is the estimated cost of the path from n to the goal.

       Greedy best-first search: Chooses the path that is closest to the goal using the equation f(n) = h(n). It can find solutions quite quickly, but it can also get stuck in loops, so many people don't consider it an optimal approach to finding a solution.

      Gaining insights into precisely how algorithms work is important because otherwise you can’t determine whether an algorithm actually performs in the way you need it to. Also, without good measurements, you can’t perform accurate comparisons


Скачать книгу