KossmanDistributedQueryProcessing

Page history last edited by PBworks 16 years, 10 months ago

Paper

  • distributed QP is now needed (earlier, distributed DBs were ahead of their time)
    • cost and scalability of large numbers of small machines beat a mainframe
    • integration of different software - each piece solves a specific problem, but each uses its own DB
    • integration of legacy systems
    • new applications need some type of distributed DB
  • distributed implies heterogeneous; parallel implies homogeneous; both have the same goals
  • textbook query processing:
    • query parser (flex, bison can be used)
    • query rewrite (no optimizations done; simply simplifies statements, unnests subqueries, figures out which partitions in a DDB need to be queried)
    • query optimizer (determine which functions are used for joins, memory allocation, ordering, choosing plans)
    • plan (represented as a tree, with consumers and producers implementing different operations (scan, sort, group, etc))
    • plan refinement/code generation (transforms the plan into something executable, like (pseudo)code)
    • query execution engine (use iterators to flow data through operators, take advantage of pipelining)
    • catalog (contains DB metadata for parsing, rewriting, and optimizing)
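The iterator model from the bullets above can be sketched as pull-based operators; this is a toy sketch, not any particular engine's API:

```python
# Volcano-style iterators: each operator pulls tuples from its child, so data
# pipelines through the plan without being materialized between operators.
def scan(table):
    for row in table:
        yield row

def select(pred, child):
    for row in child:
        if pred(row):
            yield row

def project(cols, child):
    for row in child:
        yield {c: row[c] for c in cols}

emp = [{"name": "ann", "loc": "SF"}, {"name": "bob", "loc": "NY"}]
plan = project(["name"], select(lambda r: r["loc"] == "SF", scan(emp)))
print(list(plan))  # [{'name': 'ann'}]
```

Because each operator is a generator, a tuple flows all the way to the root as soon as it is produced, which is exactly what makes pipelining possible.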
  • dynamic programming query optimization: iterate all possible access plans, pruning at each step to remove slow ones
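A minimal sketch of that dynamic programming enumeration, with made-up scan costs and a toy join cost model (a real optimizer derives costs from catalog statistics):

```python
from itertools import combinations

# Hypothetical per-relation scan costs and a toy join cost model.
scan_cost = {"A": 100, "B": 40, "C": 250}

def join_cost(left_cost, right_cost):
    return left_cost + right_cost + 0.1 * left_cost * right_cost

# best[subset] = (cost, plan): pruning keeps only the cheapest plan per subset
best = {frozenset([r]): (c, r) for r, c in scan_cost.items()}
rels = list(scan_cost)

for size in range(2, len(rels) + 1):
    for subset in combinations(rels, size):
        s = frozenset(subset)
        for r in subset:  # left-deep: join one more relation into the rest
            rest = s - {r}
            cost = join_cost(best[rest][0], scan_cost[r])
            if s not in best or cost < best[s][0]:
                best[s] = (cost, f"({best[rest][1]} JOIN {r})")

print(best[frozenset(rels)])  # cheapest full plan and its estimated cost
```

The pruning step is the `if s not in best or cost < best[s][0]` test: slower plans for the same subset of relations are discarded immediately, which is what keeps the table size manageable.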
  • in DDB, cost includes shipping result to interested parties
  • plans can be ranked by response time or resource usage; the costs of a plan's parts are added together
  • when sending data to interested sites, batch send rows
    • packs more info into packets
    • helps with burstiness (can queue up rows to be processed)
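Row batching can be sketched as a generator; the batch size and row source here are illustrative:

```python
def batched(rows, batch_size=100):
    """Accumulate rows and yield them in batches, so each network message
    carries many tuples instead of one."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# e.g. send one message per batch instead of one per row
batches = list(batched(range(250), batch_size=100))
print([len(b) for b in batches])  # [100, 100, 50]
```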
  • multicasts can be modeled as message passing from one site to another (SEA -> Munich -> Passau)
  • multi-threading: the optimizer must determine if subquery expressions should be evaluated in different threads; could speed things up or thrash the disk
  • joins with partitioned data: must determine where data is, then optimize where to do the join, eg (A1 U A2) J B or (A1 J B) U (A2 J B)
  • semi-joins only send over the data needed to check for a match (could also build a Bloom filter), but experimental results show it's usually not worth it
  • double pipelined hash joins:
    • two hash tables (one per input), initially empty
    • A and B tuples are processed as they arrive, interleaved
    • to process an A tuple: insert it into the A hash table, probe the B hash table, output any matches
    • reverse for B
    • outputs results quickly on distributed systems
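The symmetric insert-then-probe loop can be sketched as follows; the tagged-tuple stream format is an assumption for illustration:

```python
from collections import defaultdict

def double_pipelined_hash_join(stream):
    """stream yields ('A', key, payload) or ('B', key, payload) in arrival
    order; matches are emitted as soon as both sides have arrived."""
    tables = {"A": defaultdict(list), "B": defaultdict(list)}
    other = {"A": "B", "B": "A"}
    for side, key, payload in stream:
        tables[side][key].append(payload)       # insert into own table
        for match in tables[other[side]][key]:  # probe the other table
            # emit as (key, A value, B value)
            yield (key, payload, match) if side == "A" else (key, match, payload)

# interleaved arrival from two remote sites
stream = [("A", 1, "a1"), ("B", 1, "b1"), ("B", 2, "b2"), ("A", 2, "a2")]
print(list(double_pipelined_hash_join(stream)))  # [(1, 'a1', 'b1'), (2, 'a2', 'b2')]
```

Because neither input has to finish before probing begins, the first results appear as soon as matching tuples have arrived from both sites.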
  • pointer-based joins: instead of dereferencing pointers (stored in the DB) to objects (stored elsewhere) to evaluate predicates, join on Emp.DeptRef = Dept.address
  • object assembly groups objects by their relation to others so that only one round trip to remote sites is needed to build up an object
  • top/bottom N solutions:
    • `stop` operator
    • probe other tables first so that you only look for results that could be in the top N
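A `stop`-style top-N operator can be sketched with a bounded heap (the table and key are made up):

```python
import heapq

def stop_after(rows, n, key):
    """A `stop` operator: keep only the current top-n rows, so downstream
    operators never see rows that cannot make the cut."""
    heap = []  # min-heap of (key, row); the root is the weakest of the top n
    for row in rows:
        k = key(row)
        if len(heap) < n:
            heapq.heappush(heap, (k, row))
        elif k > heap[0][0]:
            heapq.heapreplace(heap, (k, row))
    return [row for _, row in sorted(heap, reverse=True)]

salaries = [("ann", 90), ("bob", 120), ("eve", 75), ("joe", 110)]
print(stop_after(salaries, 2, key=lambda r: r[1]))  # top 2 by salary
```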
  • P2P: every site can act as a server, eg the database is broken up onto sites
  • strict client-server: each site serves only one role
    • easier on security
    • client can keep GUI hardware, server is disks and procs
    • middleware useful because you can scale any part of it
  • multi-tier: hierarchical: each site acts as a server to its subordinates and a client to its superiors
  • query shipping: typical; send the query to the data. but you need a join site if the data lives on separate machines
  • data shipping: client operates on data and caches it for future queries
  • hybrid: based on how much the client already has cached
  • site selection in query optimization: must decide which logical sites can and should perform different operations (eg. client can display)
  • when & where to optimize: which site should do the optimization, based on current stats or guesses? should you make plans at compile time or query time? mixed (choose from several compiled options)?
  • for distributed or parallel: 2-step: 1) at compile time figure out order of operations, 2) at run time figure out sites
  • if you do a mixed shipping method, how do you update the server if you've written data locally?
  • for heterogeneous systems, usually you create a wrapper around the different components to let the mediator talk to them
  • a wrapper or component can broadcast its capabilities to the mediator, but this is hard to implement
  • you can use generic B_Fetch and R_Scan operators to run queries that get the data you're interested in
  • wrappers/components can estimate costs using:
    • calibration approach: make a generic cost model for all components, then adjust different args. lets developers easily fit their wrapper into the scheme
    • individual wrapper costs: harder to write, but can be used as an override of the generic model
  • heterogeneous techniques try to encapsulate the heterogeneous part, provide homogeneous interfaces, then use existing techniques
  • many differences between caching and replication: target, granularity, storage, update, removal, trigger mechanism
  • when and where to optimize?
    • store compiled plans on the server and invalidate them when they can no longer run
    • always dynamically generate them, but can run into problems when guesses about the distributed state are wrong
    • can verify your guess during the query, and if it's wrong reoptimize with your progress so far
  • two step optimization: compile time pick the operators, query time pick the sites
  • heterogeneous systems:
    • clients connect to a mediator with knowledge of the system
    • mediator connects to different DBs through wrappers
    • different wrappers publish information about themselves, possibly using capability records
    • the query uses accessPlan and joinPlan functions of wrappers to construct subplans to set up the dynamic programming table of possible plans
      • as an example, accessPlan for BigBook.com is R_Scan with a city filter, or B_Fetch with a city, category, and name filter
    • cost estimation:
      • calibration: come up with a general model, then plug in args (easy for developers, but somewhat inflexible)
      • individual wrapper cost: write out specific costs for different situations (hard for a developer, but could use it as a calibration override)
      • learning curve: learn and remember as you go (can generate very bad plans, but can use it in combination with above to adapt)
    • bindings: fetch different tuples from different DBs as needed; the mediator binds arguments and runs the query
    • cursor caching: cache the plan, then call with different bindings (prepare, execute)
  • data placement: usually static, but dynamic can improve performance
    • can optimize for network messages or speed
    • example: ADR: use a tree, replicate data in "regions" of sites based on query types, ROWA (read-one-write-all)
      • expansion test: add a site if it or its subtree generates more requests for the object than the rest
      • contraction test: drop a site if more updates are sent to it than reads
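The expansion and contraction tests can be sketched as a per-site decision function; the counter names and exact comparisons are assumptions based on the notes above, not ADR's precise formulation:

```python
def adr_decision(local_reads, subtree_reads, other_reads, writes, replicated):
    """Sketch of ADR-style tests for one object at one candidate site:
    expand the replication region to a site whose subtree reads the object
    more than the rest; contract (drop the replica) when forwarded writes
    outnumber local reads."""
    if not replicated and local_reads + subtree_reads > other_reads:
        return "expand"    # add this site to the replication region
    if replicated and writes > local_reads + subtree_reads:
        return "contract"  # drop the replica here
    return "keep"

print(adr_decision(50, 20, 30, 5, replicated=False))  # reads dominate -> expand
```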
  • cache investment: figure out the cost and benefit for shipping data to a client. usually there is a net benefit when running multiple queries
    • to implement, query optimizer must keep track of what queries have been sent, and generate non-optimal queries for the future, then remember that the data has been cached for future optimizations
  • alternative: cache materialized views. lots of problems: expiration, optimization, shipping
  • economic models for distributed queries: clients specify importance/price, brokers try to execute the query while maximizing profit.
    • can use different bidding models (run by brokers, bid on by leafs)
  • another alternative is pushed based systems
    • can scale better when it's a broadcast system
    • but no SQL like language has been formalized

Lecture

  • system R concepts
    • predicates matching an access path
    • interesting orders (joins, sorts, grouping)
    • exhaustively consider all of the plans, pick the cheapest
    • consider all single table access plans, keep the cheapest for each interesting order
    • look at all 2-way joins (originally only left-deep), plus available methods (nested loops, sort-merge)
    • repeat for 3-way joins...
    • pick cheapest
  • some differences in R* (distributed)
    • (horizontally) partitioned data
    • replicated data
  • must include costs of communication
    • per-tuple: tuples * per-tuple cost
    • per-packet: (tuples * bytes per tuple / bytes per packet) * per-packet cost
    • fancier models exist (not recommended for estimation)
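The per-message-plus-per-packet model can be computed directly; all constants here are illustrative, not from the lecture:

```python
import math

def comm_cost(tuples, bytes_per_tuple, bytes_per_packet=1400,
              per_packet_cost=1.0, per_message_cost=10.0):
    """Shipping-cost estimate: a fixed per-message overhead plus a per-packet
    cost for however many packets the tuples fill."""
    packets = math.ceil(tuples * bytes_per_tuple / bytes_per_packet)
    return per_message_cost + packets * per_packet_cost

print(comm_cost(10_000, 100))  # 10,000 tuples of 100 bytes each -> 725.0
```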
  • single-relation, single-variable queries
    • the optimal plan is independent of message cost
    • if we distribute data, we may distribute load
    • stream results back as tuples are produced
  • when you have horizontal partitioning, you may be able to optimize sites based on where the data is
    • eg, if the predicate is WHERE EMP.loc = 'SF', then the query only has to run at the SF site
  • R*, to simplify, assumed that n-way joins are simply a pipeline of joins (left deep)
    • in addition to the R query optimization plans, also keep track of which site you end up at
  • 3 distributed join options (you can use any join method (hash, merge, etc)):
    • do it at the inner table: the outer sends tuples to the inner table, it's pipelined, the inner table emits tuples
    • do it at the outer table: the inner sends to the outer; need temp space for the inner tuples when doing a NL join (sequential instead of pipelined)
    • do it somewhere else: inner to join site is pipelined, outer to join site is sequential. can be used if the next step is elsewhere
    • fetch as needed: (binding) as you go through outer, send requests to inner, get tuples back, continue
      • retrieve qualified outer tuples
      • send their join column values to the inner site
      • retrieve matching inner tuples and return them to the outer site
      • join the incoming tuples with the waiting outer tuples (at the outer site)
      • repeat
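The fetch-as-needed loop can be sketched as follows; `inner_lookup` is a hypothetical stand-in for the round trip to the inner site, and all names are made up:

```python
def fetch_as_needed_join(outer_rows, key, inner_lookup, batch_size=50):
    """Bindings join: scan qualified outer tuples, ship batches of their join
    column values to the inner site, join returned tuples at the outer site."""
    results, batch = [], []

    def flush():
        matches = inner_lookup({key(row) for row in batch})  # one round trip
        for row in batch:
            for inner_tuple in matches.get(key(row), []):
                results.append((row, inner_tuple))  # join at the outer site
        batch.clear()

    for row in outer_rows:
        batch.append(row)
        if len(batch) == batch_size:
            flush()
    if batch:
        flush()  # final partial batch
    return results

# the inner site simulated with a local dict keyed on the join column
inner = {"sales": ["budget-1"], "hr": ["budget-2"]}
emps = [{"name": "ann", "dept": "sales"}, {"name": "bob", "dept": "legal"}]
print(fetch_as_needed_join(emps, key=lambda r: r["dept"],
                           inner_lookup=lambda ks: {k: inner.get(k, []) for k in ks}))
```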
    • semijoin:
      • outer[join col] -> inner [message 1]
      • M1 J inner table -> outer site [message 2]
      • run join at the outer (so you're only doing join on tuples you know will match)
      • helps with duplication of outer tuples
      • can approximate with Bloom filters
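The Bloom-filter approximation can be sketched like this: the outer site ships a small bitmap over its join column, and the inner site ships back only tuples that might match (false positives just mean a little extra shipping); the sizes and hashing scheme here are illustrative:

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter: a bitmap the outer site can ship instead of the full
    list of join column values (false positives possible, no false negatives)."""
    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, value):
        for i in range(self.k):  # k independent hash positions per value
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all(self.bits >> pos & 1 for pos in self._positions(value))

# the outer site builds the filter over its join column values...
bf = BloomFilter()
for key in ["sales", "hr"]:
    bf.add(key)

# ...and the inner site ships back only tuples whose key might match
inner = [("sales", "s1"), ("legal", "l1"), ("hr", "h1")]
shipped = [t for t in inner if bf.might_contain(t[0])]
print(shipped)
```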
