
Frequently Asked Questions for Computational Neuroscience, Exercise 2

1. May I assume that at T(n) every object overlaps its own center from T(n-1)?
A. No.

2. Should we implement things the way the brain works?
A. Not exactly. Please read the first section of the "Project definition".

3. When emerging from the side, a rectangle might look like a triangle; when approaching from infinity, all types begin as a single pixel. Thus, we can identify the type only after we have followed the object for some time. Is this OK?
A. Yes. Just make sure your output file follows the required format.

4. Are the project definitions different for different-sized teams?
A. No.

5. May we assume that one of the eyes is fixated on the same point during the whole session?
A. You may assume that both eyes are fixated. Remember that the disparity measure is relative (it depends on the location of the origin of the axes in each image). We will say that the disparity of two pixels is 0 if they share the same x coordinate. If the center of an object is located at (x, y) in the left image and at (x + delta, y) in the right image, then we say the disparity is delta.

6. Are the triangles equilateral? Isosceles? Can we assume anything about the angles?
A. You may assume nothing about the triangles.

7. Can we assume that shape centers are never closer than 30 pixels (the maximum movement of a shape per frame)? If not, can you give us a hint on how to correlate a shape with its center?
A. No, but you may assume that the center of an object will be closer to its own location in the previous frame than to the center of any other object.
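For illustration only, here is a minimal MATLAB sketch of this nearest-previous-center matching. The names prevCenters and newCenters are hypothetical N x 2 arrays of [x y] center coordinates; treat this as a sketch, not required code:

    function idx = matchToPrevious(newCenters, prevCenters)
    % For each center found in the current frame, return the index of the
    % closest center from the previous frame (Euclidean distance).
    idx = zeros(size(newCenters, 1), 1);
    for i = 1:size(newCenters, 1)
        dx = prevCenters(:, 1) - newCenters(i, 1);
        dy = prevCenters(:, 2) - newCenters(i, 2);
        [dummy, idx(i)] = min(sqrt(dx.^2 + dy.^2));
    end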

8. When objects enter the screen, we don't know their exact center, so the disparity calculation may be incorrect (we calculate disparity based on the part of the shape we see). Is that OK?
A. As all shapes are flat, the disparity of the center equals the disparity of a corner. Thus you may calculate the disparity as soon as a shape (partially) appears.

9. Must we use edge detectors? Can we use other operators instead (for example, a "cornerness measure" - how many of a pixel's R-radius neighbors have the same color - or a "roundness measure" - perimeter^2 / area, which is lowest for circles)? It seems like the problem can be solved without using edge detectors at all...
A. Your system should be a rough approximation of V1. The problem of detecting the objects can be solved in other ways, but this is not your task here. You may use other operators together with the edge detectors, as long as you justify why they are a rough approximation of V1.
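For illustration, here is a minimal MATLAB sketch of the "roundness measure" mentioned in the question, assuming bw is a binary mask of a single shape and using the 'images' toolbox function bwperim (a sketch only; whatever you use must be justified as above):

    perim = sum(sum(bwperim(bw)));   % number of boundary pixels of the shape
    area  = sum(sum(bw));            % number of pixels in the shape
    roundness = perim^2 / area;      % smallest (about 4*pi) for a circle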

10. Given the disparity of an object, how do we calculate its depth?
A. You don't need to. For the z coordinate you should output the disparity value. We assume that an object at infinity has disparity 0, that is, it has the same x coordinate in the given left and right images.

11. Can we assume that an object approaching or receding from the 'eye' will create a difference in disparity in all cases?
A. Yes.

12. Is it required to implement the whole project using neural networks? In order to match objects in two images, we need to find the nearest object to the previous one. Can we use the mathematical minimum function, or do we need to use a neural network?
A. You don't have to implement the whole project using neural networks. Nevertheless, it should be a rough approximation of V1; therefore every operator you use should be implementable in neural networks and a plausible model of V1 (justify your solution).

13. Is it possible to build a different neural layer for each object in the image, and then process each separately, or do we need to process them all together?
A. You may choose to do so, as long as in the end you have a single, whole system that performs the described task.

14. In the project definition and in question 7 of the FAQ you used the word 'close'. How do you define the distance?
A. The Euclidean distance: sqrt(dx^2 + dy^2).

15. Can a shape in one picture (left or right) have no match in the other picture (right or left) in a certain frame?
A. This might happen only because of disparity, when the shape is near the border of the image.

16. Can a shape appear only partially throughout all the frames (cut at the side by the border), or is there always a frame in which the whole shape appears?
A. The latter: any shape will detach from all borders in at least one of the frames.

17. It is said that we can use the 'images' toolbox.  Are we allowed to use all functions in the 'images' toolbox, especially bwlabel?
A. You are allowed to use any function, as long as it is an approximation of what the visual cortex does. Do justify your algorithm. See Q9.
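For example, here is a minimal sketch of how bwlabel could be used to separate the shapes in a binary image bw (hypothetical code, to be justified as above):

    [labels, numShapes] = bwlabel(bw, 8);      % label connected components (8-connectivity)
    for k = 1:numShapes
        [ys, xs] = find(labels == k);          % pixel coordinates of shape k
        centers(k, :) = [mean(xs), mean(ys)];  % its center (the average of its pixels' coordinates)
    end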

18. Do we have to use neural networks themselves, or do we have to use methods that can be applied to neural networks?
A. The latter, but you have to give a good explanation of why it is "implementable" in neural networks and why it is a plausible model of the visual cortex.

19. What is considered an "edge detector"? Can we use any function that returns points lying on lines of certain angles in the image? May we detect the edges as we wish?
A. See the answers to Q9 and Q17.

20. Can the two eyes see different objects?
A. Only on the boundaries (because of the disparity).

21. Can one object appear and another disappear in a single frame?
A. Yes.

22. Can more than one object disappear in a single frame?
A. Yes.

23. Can there be a situation in which an object left the frame and a new object entered, so that the new object's center is within a 30-pixel radius of the previous object's center?
A. No.

24. What is meant by the edge detectors being of "various sizes"?
A. It means that they give the strongest response to edges of a specific length / width.

25. Does V1 accept input in log-polar coordinates?
A. Yes, but you don't need to use this fact in your model.

26. We need to make an approximation of V1; the question is how rough this approximation should be. Can you give general guidance about which MATLAB functions are sufficiently "low level" to be used? For example, can we use the conv2 function?
A. You may use the conv2 function, but note Q9 and Q19.
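For example, a minimal sketch of using conv2 with one small oriented kernel as a rough edge detector (assuming the BMPs are grey-level; the kernel and the threshold of 100 are arbitrary illustrations, not a prescription):

    img = double(imread('./input/Left1.bmp'));   % grey-level image as a double matrix
    vertKernel = [1 0 -1; 2 0 -2; 1 0 -1];       % responds most strongly to vertical edges
    response = conv2(img, vertKernel, 'same');   % edge map of the same size as the image
    edges = abs(response) > 100;                 % hypothetical threshold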

27. I have a "Miluim" which, according to your web page, entitles me to a late deadline. What should I do?
A. Just add a photocopy of the "Ishur" when you submit the exercise (please note the rules).

28. How accurate should the results be?
A. Small fluctuations of +-5 pixels are allowed. The output format should exactly match the given example.

29. How do you define the center of an object?
A. The center of an object is the average of all its pixels' coordinates.
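In MATLAB terms, assuming mask is a 0/1 matrix with ones at the object's pixels (and x = column index, y = row index, as in the coordinate-system question below), this is simply:

    [ys, xs] = find(mask);    % row (y) and column (x) indices of the object's pixels
    centerX = mean(xs);
    centerY = mean(ys);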

30. Where can I find additional information about the physiology and functionality of the visual cortex?
A. You may look at the chapter about vision in "Physiology of Behavior" by Carlson (it can be found in the social sciences library).

31. From the definition of the exercise I concluded that an object in the left eye can be +10 to -10 pixels away (on the x axis) from its twin object in the right eye. Can another object exist in this range? Can it be that, when putting the right picture over its twin left picture, different objects will partly overlap? To sum up, what are the criteria for determining which object in the right-eye image is the twin of an object in the left-eye image, and vice versa?
A. The criteria are better described in the two papers suggested for your stereo algorithm. Note two important issues:
a. Your algorithm for calculating the depth map may work on the low-level input (that is, before you "know" what the object is).
b. Suppose a confusing situation exists (for example, the one you described above), in the sense that a person watching this scene might not understand what s/he sees, or might perceive it as two alternating scenarios; then we expect a good model of the visual system to have (similar) problems in "understanding" this input.

32. Can we assume that every object eventually reaches a certain size (big enough to find its shape)?
A. Yes.

33. When we calculate an object's center, does the function that calculates it have to correspond to a similar operator in the visual cortex?
A. Not really. The whole model should, but the exact coordinates of each object are mainly there to enable checking your results.

34. We understood that the 'left' pictures represent what the left eye sees and the 'right' pictures represent what the right eye sees. However, in all the 'left' pictures (left3.bmp, for example) the objects appear closer to the left picture boundary than in the 'right' pictures (for example, right3.bmp), while in reality the right eye sees objects to the left of where the left eye sees them. So, shouldn't the objects be closer to the left picture boundary in the 'right' pictures?
A. Focus on the wall in front of you. Put a finger in front of your eyes, but keep looking at the wall. You will notice a contradiction to what you have just said. For more details, you may look at "Vision" by Marr, page 113.

35. In the project description you said that in every frame only one new object may appear, but in the 6th image two new objects appear. We relied on the assumption that only one new object appears each time in order to distinguish the new object from the others, and from there on to "follow" it. For example, how can we do the depth-from-stereo calculation when two new objects may appear at the same time?
A. This was indeed a mistake. Nevertheless, the only reason for allowing only one new object at a time was to have a unique order of the objects in the output file. Note that the left-right correlation for the disparity calculation need not be performed at the high "recognition" level, but at the lower level, by one of the two algorithms given to you in class.
Of course, in a case where two new objects appear at the same time, it doesn't matter which one appears first in the output file (all possibilities will be accepted).

36. Which center should we output in the case of non-zero disparity (left/right/average)?
A. Left.

37. What disparity should be given when an object appears in one eye only (maybe the shortest distance from the boundary)?
A. Either the shortest distance from the boundary or zero.

38. Some of the functionality we are to implement is done in the brain in parts other than V1. What should we do?
A. Try to implement the things that are in V1 as described in the project definition. The (more complex) features should be reasonable. Most important: explain what you did and why you did it this way.

39. Which center should we output in the case of non-zero disparity when only the right object is seen?
A. Right.

40. What is the minimal size an object will reach at some point in the movie?
A. Any object will be at least 6 x 6 pixels in at least one frame (where 6 x 6 is the size of the minimal bounding rectangle).

41. Where should we store the results?
A. './output.txt' (that is, at /cns00b/output.txt).

42. On which MATLAB version will the project be checked (on which server should we test it - nova, libra, plab-##, etc.)?
A. libra.

43. The project specifies that we should use something like receptors that identify different lines and angles and use them to identify objects. There is an enormous number of possible lines and angles; how are we supposed to handle so much data and deduce something from it (except by neural networks)? Can't we just identify the shapes by a module that checks their number of corners or something?
A. You don't have to use all possible orientations. You may choose your own parameter for "angular resolution", but this has to be the first step. From there on, counting the corners may be as plausible a solution as any other.
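As an illustration of choosing an "angular resolution", here is a minimal sketch that builds a few oriented, derivative-of-Gaussian-like kernels by hand and convolves the image with each (the 45-degree resolution, the kernel size, and sigma are arbitrary assumptions):

    img = double(imread('./input/Left1.bmp'));   % grey-level input image
    angles = 0:45:135;                           % chosen angular resolution: 45 degrees
    [x, y] = meshgrid(-3:3, -3:3);               % 7x7 kernel support
    sigma = 1.5;
    for k = 1:length(angles)
        th = angles(k) * pi / 180;
        u  = x * cos(th) + y * sin(th);          % coordinate across the preferred edge
        kernel = -u .* exp(-(x.^2 + y.^2) / (2 * sigma^2));   % odd, edge-like filter
        responses(:, :, k) = conv2(img, kernel, 'same');      % one orientation map per angle
    end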

44. In what way can we use the algorithm we learned for identifying disparity by a peak in the frequency domain of the log of the pictures? Can't we first identify the objects and, after matching the objects identified in both eyes, calculate the disparity?
A. You may use any algorithm studied in class for calculating the depth map / disparity map from raw data (NOT from already identified objects), as this better simulates what happens in V1. You may use Yeshurun and Schwartz's cepstrum, Marr and Poggio's neural network, or just correlation (calculate the correlation of every point and its neighborhood with the candidate corresponding points and their neighborhoods in the other picture, and select the one with the highest correlation).
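For the last (correlation) option, here is a minimal sketch that estimates the disparity at a single point by comparing a small window around it in the left image with windows shifted along the same row of the right image (corr2 is from the 'images' toolbox). The window half-width of 5, the +-10 pixel search range, and the omission of boundary checks are simplifying assumptions for illustration:

    function disparity = pointDisparity(left, right, y0, x0)
    % left, right: grey-level image matrices; (x0, y0): a point in the left image.
    w = 5;                                         % half-width of the correlation window
    patchL = left(y0-w:y0+w, x0-w:x0+w);
    best = -Inf;
    disparity = 0;
    for d = -10:10                                 % candidate disparities
        patchR = right(y0-w:y0+w, x0+d-w:x0+d+w);  % same row, shifted by d pixels
        c = corr2(patchL, patchR);                 % normalized correlation of the two windows
        if c > best
            best = c;
            disparity = d;                         % keep the shift with the highest correlation
        end
    end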

45. Can you please post the output file for the input you provided in the project definition (we want to compare results)?
A. No. You are to print your results (this will be part of your grade).

46. Is there more input available for testing our program?
A. Currently, no. If more becomes available, I will publish it on this site.

47. From where should the program read the input? In the Project Definition you wrote '~cns00b/input/', but this directory doesn't exist. Did you mean ~/cns00b/input/? In that case, can I assume that I am running from ~/cns00b (so I can read from ./input)? If not, MATLAB doesn't recognize the ~ character as a home directory.
A. Your program (when copied to "/cns00b/" of some home directory) should read its input from the subdirectory "./input/". In this subdirectory there will be the image files "Left1.bmp", "Right1.bmp", "Left2.bmp", etc. There will also be a file named "info.txt" containing the number of pairs of images.
Note that during the checking procedure, your entire "~/cns00b/" directory will be copied to the checker's account and run from there (using input from "./cns00b/input/").
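For example, a minimal sketch of the reading loop (assuming info.txt contains only the number of pairs as plain text):

    nPairs = load('./input/info.txt');          % number of stereo pairs, read from info.txt
    for k = 1:nPairs
        left  = double(imread(sprintf('./input/Left%d.bmp',  k)));
        right = double(imread(sprintf('./input/Right%d.bmp', k)));
        % ... process the k-th stereo pair here ...
    end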
 

48. We are currently debating an important theoretical issue regarding the order of processing in the model:
One possibility is to detect edges (of various sizes and lengths) over ALL of the picture BEFORE any further processing. This option may resemble one of the known functions of V1 (and of the entire visual system), which includes seeking parallel firing from ADJACENT neurons, and hence discovering lines, angles, and finally OBJECTS. While this option may seem more logical, it is less efficient in the time needed and entails much easily avoided complexity.
The other possibility is to take into consideration that there are locations in the image which contain clusters of neurons that fire simultaneously, since all of them respond to (approximately) the same visual stimulus. Therefore, we can use these clusters to first determine that there IS an object there (in that general location). Only the SECOND step would be to identify the object's features - edges, angles, etc. - i.e., use the function bwselect to select the objects, since the visual system knows there is a unique stimulus at that location.
To sum up, the first possibility entails using edge detection on all of the picture from the very beginning, in order to determine where the objects are. The second option is to divide the field into distinct areas beforehand, and only then detect edges, angles, etc.
A. Whatever you choose, please EXPLAIN why you chose what you chose, what the alternatives were, etc. Note that it is otherwise almost impossible to infer this from your code alone.
Note that receptors on the retina fire in response to any light (not just to defined objects), and some neurons in V1 may fire more in response to darkness.
If you make simplifying assumptions which are due to the limited power of the computer and are not part of your model (you would do it differently if you had a more powerful computer), do WRITE them DOWN and explain.

49. Can there be a situation in which there is the same number of objects in both eyes, but they are not the same objects (because one is just leaving and another is just appearing)?
A. Yes.

50. Is it OK to assume that the coordinate system is such that the top-left pixel of the picture matrix has the coordinates x=1, y=1, the pixel to its right has the coordinates x=2, y=1, the pixel below it has the coordinates x=1, y=2, etc.?
A. Yes. These are the coordinates that will be used.

51. Can we assume that the image is always 256x256 pixels?
A. Yes.

52. About the documentation: you wrote that it should not exceed two 2-sided pages (four printed pages). How strict is this?
A. Any work that exceeds this limit by more than one page will lose points. This should be more than enough, as you are asked NOT TO PRINT THE MATLAB CODE.