Welcome to TDBSoverflow, Our class's own StackOverflow. Our rules:
  1. Use only meaningful and self-explanatory titles
  2. Tag your questions with meaningful keywords
  3. Use upvotes and downvotes to rate the answers
  4. When you receive a satisfying answer - Click the "V" button
Remember: you may get up to 5 bonus points to your final grade!

SORTED UNCULSTERD INDEX

+1 vote
42 views
האם אני רוצה לעשות

JOIN

על שתי טבלאות, כאשר על השדות עליהן עושים את האיחוד בטבלה אחת יש

SORTED UNCULSTERD INDEX

ובשניה

SORTED CULSTERD INDEX

במידה ואני עושה

SORT MERGE

האם העובדה ששני האינקסים הם ממויינים מספיקה כדי לבצע את זה בלי מיון?

או שזה שהאינקס הוא

UNCULSTERD

גורם לצורך כן להפעיל את המיון ?
asked Feb 4, 2018 by tikitak (2,970 points)

3 Answers

+1 vote
כן!

בשביל פעולת ה

JOIN

אנחנו צריכים שהרשומות יהיו ממויינות בזכרון (בבלוקים)

unclustered index

לא עוזר במקרה הזה, כי יכול להיות שבשביל רצף כלשהו של רשומות נצטרך לטעון את כל הבלוקים של הרלציה.
answered Feb 4, 2018 by oz (12,430 points)
אז מה נותן לי בעצם
SOTERD unclustered index
כלום בתאכלס ?
+1 vote
Correct me if I'm wrong but i would say that getting the relation with the unclustered index sorted would cost:

min { T(R),  2k*B(R) }

since theoretically it is possible that getting T(R) pages of entries will cost less then sorting.
answered Feb 5, 2018 by Yossielman (990 points)
I think in any realistic DB T(R) will always be bigger than 2k * B(R) considering we were told that even the biggest DB in the world it takes 2-3 phases of The external Merge-Sort algorithm and the sorts in Merge-Sort join use this algorithm to sort so we k is no greater than 3.
0 votes
I don't think we have to sort S just because the index is unclustered, we just have to use the index more intelligently.

If instead of reading matching tuples in S you collect all the addresses of the matching tuples, sort them and then fetch the pages with respect to this list of sorted addresses.

This gives us min{B(S), T(S)/V(S,col)} cost for reading.

If the index fits in fast memory than we can do all this with 0 IO (except the initial reading of the index but usually we have it already in memory).

Either way its far better than sorting the relation S so that the data will be clustered.
answered Feb 5, 2018 by uribracha (3,110 points)
...